Exploring PoseNet & p5.js & ml5.js

For this workshop I chose to continue my explorations of PoseNet, which allows for real-time human pose estimation in the browser; I had started learning the framework in the Body-Centric Technologies class. I wanted to try working with still images, but I couldn’t get the PoseNet model to track whenever I used images, so my workaround was to group the images into a video. My idea was to explore how women pose on fashion magazine covers by comparing poses across different covers. Below is a video showing the final results of poses captured while working with the PoseNet-with-webcam example.

I found that the body tracking worked well only when the video was sized to a width of 640 pixels and a height of 480 pixels, the dimensions used in the ml5 examples.

What is ml5.js? It is a wrapper around TensorFlow.js that makes machine learning more approachable for creative coders and artists. It runs in the browser and requires no installed dependencies beyond the regular p5.js libraries.

NOTE: To use ml5.js you need to be running a local server. If you don’t have a localhost setup, you can test your code in the p5.js web editor – you’ll need to create an account.

I also found that multi-pose tracking seemed to cap out at three poses whenever more than three people were in frame. The model’s skin color affected tracking, so at times some body parts were missed, and clothing also affected whether parts were tracked: sometimes a model’s limbs were ignored, or the clothes were tracked as additional limbs. The keypoints seemed to be detected consistently, but the lines of the skeleton were not always completed.

What are keypoints? These are 17 data points that PoseNet returns, each referencing a different location on the body/skeleton of a pose. They come back in an array whose indices 0 to 16 each reference a particular body part, e.g. index 0 holds the results for the nose, such as its x,y coordinates and a detection confidence score.
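To make the keypoint array concrete, here is a minimal sketch of reading a keypoint out of a pose result. The sample pose object below is made up for illustration; in a real sketch these objects arrive from ml5’s poseNet `'pose'` event, and the confidence threshold of 0.5 is my own choice.

```javascript
// Index 0 of PoseNet's 17 keypoints is the nose.
const NOSE = 0;

// A pose result shaped like ml5 PoseNet output: each keypoint has a part
// name, a confidence score, and x,y pixel coordinates. (Sample data only.)
const samplePose = {
  keypoints: [
    { part: 'nose', score: 0.98, position: { x: 320, y: 180 } },
    // ...indices 1-16 cover eyes, ears, shoulders, elbows, wrists,
    // hips, knees, and ankles
  ],
};

// Return a keypoint's position only if it was detected confidently enough.
function keypointPosition(pose, index, minScore = 0.5) {
  const kp = pose.keypoints[index];
  if (!kp || kp.score < minScore) return null;
  return kp.position;
}

const nose = keypointPosition(samplePose, NOSE);
```

With a helper like this, a sketch can react to individual body parts (e.g. draw something at the nose position) without ever drawing the full skeleton.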

Below are some of the images I tested with:


I’d like to continue working on this; however, I would like to explore OpenPose, a framework like PoseNet that tracks more keypoints than PoseNet’s 17. From my work with PoseNet so far, I find it most useful when you aren’t drawing a skeleton but are doing something more with the keypoint data, e.g. the right eye is at this x,y position, so trigger a certain action.

I tried some of the other ml5 examples, but I wasn’t satisfied with the results. I was particularly interested in the style transfer and the interactive text generator; however, I found that for them to be useful to me, I would have to train my own custom models, and I had neither the time nor an adequate dataset to do this.

I also tried the video classification example, toying with the idea of having the algorithm detect a specific image in a video. It quickly dawned on me that this was a case for a custom model, as the pre-trained model seemed to work best only when generic objects were in view: at times it recognized my face as a basketball, my hand as a band-aid, my hair as an abaya, etc. I also noticed that if I brought objects closer to the camera, detection improved slightly. Below are some of my findings using MobileNet video classification in p5.


Pros & cons of using a pre-trained model vs. a custom model? With a pre-trained model like PoseNet on TensorFlow.js, a lot of the work has already been done for you. Creating a custom model is beneficial only if you are looking to capture a particular pose, e.g. if you want to train the machine on your own body – but to do this you will need tons of data. Think thousands or even hundreds of thousands of images, or 3D motion capture, to get it right. You could crowdsource the images, but then you have to think about copyright and about your own bias regarding who is in the images and where in the world they are. It is imperative to be ethical in your thinking and choices.

Another issue to keep in mind is the diversity of your source images, as a narrow set may cause problems down the line when it comes to recognizing different genders or races. Pre-trained models are not infallible either, and it is recommended that you test models before you commit to them.

Word of the Day

For this project, I wanted to explore fetching data from Adafruit IO to use in a p5.js sketch. While exploring IFTTT, I noticed that most of the services in the “that” section were either very restricted in their feature offerings or tied to a particular IoT home device. I decided to try receiving data from IFTTT via my Adafruit IO account.

My project displays the word of the day from Wikipedia’s Wiktionary site. The word received is then displayed in a p5.js sketch along with its definition.

From this project I learned how to use XMLHttpRequest and parse response data while getting data from Adafruit IO.

  • Testing saving multiple values

I created a new applet using Gmail and Adafruit IO to collect the email sender and subject line. When setting up the data in the “Add ingredient” tab of the applet creation flow, I realized I needed to add delimiters so that I could send multiple values in one applet trigger. This is shown in the data below:



Results from testing showing data with a delimiter and data without
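A small sketch of the delimiter idea: pack the two values into one string on the IFTTT side and split them back apart in the sketch. The `|` character and the field names here are my own choices for illustration, not anything IFTTT requires.

```javascript
// Pack two values into a single feed entry using a delimiter.
// (The '|' delimiter is an arbitrary choice; pick any character that
// won't appear in the data itself.)
function packValues(sender, subject, delimiter = '|') {
  return sender + delimiter + subject;
}

// Unpack a delimited feed value back into its parts.
function unpackValues(value, delimiter = '|') {
  const [sender, subject] = value.split(delimiter);
  return { sender, subject };
}

const packed = packValues('jane@example.com', 'Meeting moved to 3pm');
const { sender, subject } = unpackValues(packed);
```

Without the delimiter, the two ingredients arrive mashed together and there is no reliable way to tell where the sender ends and the subject begins.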

  • Sending values back to p5.js script

To get values from Adafruit IO, I made a “GET” XMLHttpRequest() to the following endpoint: https://io.adafruit.com/api/feeds


Note: I had trouble accessing the returned data when trying to pass the incoming data to the reqListener function as a parameter – nothing printed to the console. However, when referring to the current object as this.responseText, I was able to access the returned data, using the JSON keys to refer to the elements, i.e. feed.name and feed.description.
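A minimal sketch of this pattern, assuming the Adafruit IO feeds endpoint returns a JSON array of feed objects with `name` and `description` keys (as described above); `AIO_KEY` is a placeholder for a real account key.

```javascript
const AIO_KEY = 'YOUR_AIO_KEY'; // placeholder, not a real key

// Pull each feed's name and description out of the raw JSON string.
function parseFeeds(responseText) {
  const feeds = JSON.parse(responseText);
  return feeds.map((feed) => ({
    name: feed.name,
    description: feed.description,
  }));
}

function fetchFeeds() {
  const req = new XMLHttpRequest();
  // A regular function (not an arrow) so that `this` is the request
  // object and this.responseText holds the returned data.
  req.addEventListener('load', function () {
    const feeds = parseFeeds(this.responseText);
    feeds.forEach((f) => console.log(f.name, f.description));
  });
  req.open('GET', 'https://io.adafruit.com/api/feeds?x-aio-key=' + AIO_KEY);
  req.send();
}
```

Keeping the parsing in its own function also makes it easy to reuse when querying a single feed later.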


The results printed to the console are shown below.


  • To get a specific feed I use the following URL:

var url = ("https://io.adafruit.com/api/feeds/emails" + "?x-aio-key=" + AIO_KEY);

The JSON response was then parsed and printed to the screen. Below is the result of the test, showing the sender and subject of the last email received in my school email account.



Once I had my proof of concept, I switched to my Wikipedia applet that returns the word of the day.



Adafruit IO API Docs : here

XMLHttpRequest : here

Link to code: here

AM to PM (Digital Postcards)


AM to PM is a visualization of local time around the world, using PubNub and the Wolfram Spoken API to query the current local time of countries. Click on a country to see its local time visualized as a lit-up sky, with colors indicating the time of day.


My first idea was to create a project under the theme of wildlife, potentially wildlife conservation, but I was unable to find a suitable wildlife data API to query. I tried Wolfram Alpha, but the results returned by the Spoken API were very unpredictable. E.g. when I asked “How big is a lion?” I got the response

The typical length of a lion is about 9.8 feet

however when I queried “How big is a buffalo?” I got the response

The area of Buffalo, New York is about 40.4 square miles

This showed me that the Spoken API would not be beneficial for my purpose: it returns conversational answers better suited to an app where voice is used as a feature. I decided to limit my queries to countries and their local time using the query “local time country name”, e.g. “local time Kenya” returns

The answer is 7:14:48 P.M. EAT, Tuesday, February 5, 2019

To work with this response, I split it at each space and find the position of the string ‘is’. Once found, the next index, i.e. response[i+1], gives the time (7:14:48), and response[i+2] gives the string ‘A.M.’ or ‘P.M.’.

The time variable is then split at each “:”, and I use the hour value and an A.M./P.M. check to determine what color sky to display. Each sky color varies per hour: dark black-blue-purple skies for night, purple-pink skies for sunrise, blue skies for morning, yellow skies for midday, orange skies for evening, and purplish-orange skies for sunset.
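The two steps above can be sketched as plain functions: one to pull a 24-hour value out of a Spoken API response, one to pick a sky band. The exact hour boundaries and color names in `skyColor` are my own rough guesses at the bands described above, not the project’s actual values.

```javascript
// Extract a 24-hour hour value from a response like
// "The answer is 7:14:48 P.M. EAT, Tuesday, February 5, 2019".
function extractHour(response) {
  const words = response.split(' ');
  const i = words.indexOf('is');
  const time = words[i + 1];            // e.g. "7:14:48"
  const isPM = words[i + 2] === 'P.M.';
  let hour = parseInt(time.split(':')[0], 10);
  if (isPM && hour !== 12) hour += 12;  // 1 P.M. -> 13, etc.
  if (!isPM && hour === 12) hour = 0;   // 12 A.M. -> 0
  return hour;
}

// Map an hour to a rough sky band (boundaries are illustrative guesses).
function skyColor(hour) {
  if (hour < 5 || hour >= 21) return 'dark blue';   // night
  if (hour < 8) return 'pink';                      // sunrise
  if (hour < 11) return 'blue';                     // morning
  if (hour < 14) return 'yellow';                   // midday
  if (hour < 18) return 'orange';                   // evening
  return 'purple-orange';                           // sunset
}

const hour = extractHour('The answer is 7:14:48 P.M. EAT, Tuesday, February 5, 2019');
```

Anchoring the parse on the word ‘is’ is what makes this fragile for other query types – the weather responses below don’t follow the same shape.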

I would like to expand my project to query more climate data, perhaps using the Wolfram Full Results API, which I was not able to get working for this project. I didn’t query this data with the Spoken API because I couldn’t predict what response I would get for each country; hard-coding a parser like the one I wrote for the “local time” query becomes much more difficult with my larger dataset of 244 countries due to the varied responses, e.g.:

“Current weather in Kenya” returns

The weather in Kenya is mostly clear, with light winds and a temperature of 73 degrees Fahrenheit

“Current weather in United States” returns

The weather in the United States ranges from fog to overcast, with light winds and a temperature of 20 degrees Fahrenheit

Code / Process – Challenges & Experience:

My initial idea was to work with wild animals, but I couldn’t find data.

I wanted to map the mouse position to latitudes and longitudes, but I scrapped this idea because: 1. it would end up making too many API calls; 2. I found that giving (lat, long) values to Wolfram didn’t return the expected results – Wolfram just reinterpreted the value and can’t return a country from a latitude and longitude. You can’t query which country sits at a specific latitude.


I thought of using the geolocation getLocation() function to get GPS coordinates but decided against it, as this would tie me to the country linked to my IP address, and I wanted my sketch to change dynamically to show data on various countries. Instead I took latitude/longitude values from the Google Developers countries.csv dataset and translated each lat/long to x,y screen coordinates using a Mercator map projection function. Below is an example of how the coordinates translated to 2D space.


After determining that the translation function was working correctly, I created a Country class, generating new countries from a .csv file and saving them in an array when the sketch was loaded. Each Country object had the following attributes: lat, long, country code, country name, x coordinate, and y coordinate.

In my Country.translate() function I put the code to translate from lat/long to 2D x,y coordinates via the Mercator projection, and in Country.display() I placed the code to draw a circle at that x,y coordinate.

Upon drawing the dots for the countries, I realized they were skewed off the map because the x,y values corresponded to a Cartesian plane. To fix this I used the map() function to map the x and y values to screen coordinates.


Mapping the x-values to between 0 and screen width.


The final x,y values mapped to the screen’s width & height.
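The translation plus rescaling steps above can be sketched as two plain functions. This uses the standard web Mercator formula, which may differ in detail from the projection function the project actually used; `mapRange` is a plain-JS stand-in for p5’s map(), and the Nairobi coordinates are just an example input.

```javascript
// Project lat/long onto a width x height plane using the standard
// web Mercator formula.
function mercator(lat, lon, width, height) {
  const x = (lon + 180) * (width / 360);
  const latRad = (lat * Math.PI) / 180;
  const mercN = Math.log(Math.tan(Math.PI / 4 + latRad / 2));
  const y = height / 2 - (width * mercN) / (2 * Math.PI);
  return { x, y };
}

// Plain-JS version of p5's map(): rescale v from [a1, b1] to [a2, b2].
function mapRange(v, a1, b1, a2, b2) {
  return a2 + ((v - a1) / (b1 - a1)) * (b2 - a2);
}

// Example: roughly Nairobi (lat -1.29, lon 36.82) on a 640x480 canvas.
const nairobi = mercator(-1.29, 36.82, 640, 480);
```

After projecting, mapRange() can stretch the min/max of the projected x and y values across the full canvas, which is what fixed the skewed dots.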

The next step was to make each country dot clickable, so that when a user clicked a dot, its country name would be passed to PubNub to query the Wolfram API.


I then created a Country.clickCheck() function, triggered on mouseClicked(), to check which dot has been clicked (see image above). This is done by checking the mouseX,mouseY location against a designated area around the x,y coordinate of each country dot. When a clicked country is found, its name is saved in a global variable (updated on each successful click) and passed to the PubNub Wolfram Alpha query.
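The hit test at the core of clickCheck() can be sketched as a distance check; the 5-pixel radius here is my own guess at a reasonable clickable area, not the project’s actual value.

```javascript
// Return true when the mouse click lands within `radius` pixels of a
// country dot at (dotX, dotY). In a p5 sketch, mouseX/mouseY would be
// passed in from mouseClicked().
function clickCheck(dotX, dotY, mouseX, mouseY, radius = 5) {
  const d = Math.hypot(mouseX - dotX, mouseY - dotY);
  return d <= radius;
}
```

In the sketch, mouseClicked() would loop over the countries array and use the first dot for which this returns true.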

I also updated my code so that when a country is clicked, it is highlighted by a black dot to make the visualization more meaningful.


Kenya’s sky at 8:00:43 P.M. EAT on Tuesday, February 5, 2019

Aesthetic choices made:


To experiment with a new way of asking a question, once I had constrained my project to local time and countries, I wanted to avoid asking the user to type in a country. So I settled on a map visualization, taking advantage of a map’s affordances. I wanted to keep it abstract, so instead of country shapes I created Country objects represented as dots on a 2D map – and unexpectedly, just by looking at the dots you can almost tell which continent is which. My goal was to keep it aesthetically pleasing but still informational. I am toying with the idea of printing the time values, but I think the colors are fairly easy to interpret.

Possible future expansions:

I’d like to explore adding more APIs to this project.

Link to code: here


Natural logarithm in p5: here and here

Mercator Map projection: Lat,Long -> x,y screen co-ordinates: here

Working with / Reading from CSV files: here




A “Silent” Alarm / An XBee x Arduino Process Journal


This little device is a simple desk companion that acts as a visual silent alarm for the hearing impaired. In theory it would react to a doorbell and spin indicating that someone was at the door or that mail had been left. Alternatively, it also acts as a distracting toy that reacts to notifications.

XBee Chat Exercise:

During the chat exercise, my partner and I didn’t have much trouble configuring our XBees and were soon sending messages back and forth. However, we didn’t do much actual chatting, because long messages would have packets dropped: one of us would send “Hello” and the other would receive “Hell”. It was interesting to see how this led to funny miscommunications, and it led me to conclude that the XBee isn’t really a communication device in the traditional sense of the word – I would have to think of communication beyond words. We found that the most effective approach was to send single characters.

XBee Radio Receiver / Transmitter:

Tip: Use the edge of the breakout board, not the XBee itself, to determine where to put the power, TX, and RX pins for the XBee connection.

While testing the XBee with the Arduino Physical Pixel example, I was able to control an LED from the serial monitor; however, when trying to control the Arduino x XBee setup from another XBee, we ran into issues. We achieved only one-way communication: my LED would light up or turn off on signals from my partner’s XBee, but I could not light their LED from my radio. Another group saw the same behavior.

Troubleshooting (tested with 3 different partners, as the one-way communication issue occurred each time I connected to a new XBee)

We noticed that:

  1. The radios would work when both were configured on the same laptop.
  2. (Best troubleshooting method) The radios would work after sending messages back and forth over chat.
  3. The radios would work when brought closer together.

XBee Metronome Receiver

For my device’s design, I thought about the instructions we got in class: to design an interaction that would sit among 20 or more others. Initially I wanted to use sound as an output, but I figured something visual would be a better choice, as it would still be noticeable among other devices reacting to the metronome. Removing sound from the equation and focusing on visuals made me think of the hearing impaired: “What if you could have a tiny visual desk alarm that spins when someone rings your doorbell?” I also wanted to learn to work with a servo, as I had never used one before.


When conceptualizing my design I had envisioned a rotating cylinder or disk inspired by spinning tops and wheels; however, I realized that the micro servo can only make 180-degree rotations, not the 360-degree spins I had imagined. I didn’t have the knowledge to hack my servo or the time to get another one, so I improvised the rotations to still create an optical-illusion effect. Below are some images from my prototyping.

Future expansion:

I would like to continue to explore making desktop companions thinking along the themes of accessibility and self-care toys. I’d also like to work with more servos.

Github link to code : here