Objective & Concept
Through PoseNet I wanted to explore networking with computer vision to visualize multiple people watching a screen together. This project tracks the position of the nose on every unique browser page and shares the data amongst the connected users. The tracking allows for users to be aware of the physical position of where the other users are. This creates a spatially aware sensation by either encouraging others to follow other “noses” or for users to move away and create their own space on the browser page.
I followed along on Daniel Shiffman’s Coding Train tutorial where he explores the concepts of what is PoseNet, what is the data that is given, and how can you visualize the data. In his example, he visualizes a nose through an ellipse that follows the nose along with the user on the screen.
The most interesting (and surprisingly simple) part of PoseNet is that it simply changes your body into what could be perceived as “cursors” on the screen (x & y coordinates).
There are a few different examples of how to expand on this data with P5, such as creating a light nose. These examples with PoseNet are interesting because it uses the body as a controller. I had previously explored PoseNet in my Body Centric class in this blog post here where I expanded upon this concept.
For this project, to emphasize the ubiquity of the webcam, I wanted to explore what would it look like to have multiple people be visualized on the screen together.
Starting with Shiffman’s code for tracking the nose, I used this to create the ellipse that follows the nose along with the user on the open browser.
I then found the x and y coordinates of the nose and connected my page to PubNub to publish and subscribe to the position of the ellipse.
I followed along in a previous example from my Creation and Computation class in the fall that tracks how many users are on a webpage using PubNub. In this example, every user loaded on the page sends the x and y coordinates of their cursor on the webpage when they click. The code then takes the average of all the user coordinates and draws lines to the average point of all the cursors.
I connected my code of the nose to PubNub and sent the coordinates of my nose. I chose a nose because it is the center of the face, and most accurately depicts where the user is in relation to their screen. Upon receiving the subscribed data I would check to see if there was a “new nose”. If there was a new nose, that user would be added into the active user array of “All Noses”. Upon every time a message from PubNub was received I would check to see if their ID was in the array and if so then I would update the coordinates of where they are on the screen.
Two noses chasing each other on the page.
The code then loops through the array and draws an ellipse with the received/sent coordinates of the user’s noses. When the user leaves, the ellipse stays there which shows a trace of all the users that have been active on the page.
Along with sending the x & y coordinates, I also sent along to PubNub the RGB values of the user’s nose ellipse. This was to differentiate the different user’s on the page and also allow the user’s to uniquely identify themselves on other’s browsers.
Video documentation of moving around another nose.
The interaction of the two noses was interesting because it prompted either an aversion of the different noses overlapping or an urge to make the dots touch. The action of moving your nose along the screen was not as direct as it was perceived. The motion was laggy, which prompted by jerky motions from the users.
This experiment was an initial exploration into mapping multiple physical spaces together into a virtual space. In further iterations, I would make an action occur if the different positions of the noses overlapped on the browser page. I chose not to this time because I did not want to make anything that could be interpreted as a game mechanic. I wanted to see what the actual reaction would be amongst a few users. In another iteration, I would include other parts of the body to track, such as eyes, or wrist. The single tracking of the nose was effective for tracking the facial position of the user which is the most common angle seen from sitting down at a webcam (while working at a computer).
Overall I am interested in exploring networking computer vision technologies further in a pursuit to examine the multiplicity of spaces and existences we inhabit simultaneously.