Machine Learning: PoseNet

By: April and Randa
Link to sketch code: https://editor.p5js.org/aprildezen/sketches/gkVoUOBeo

Strategy
Our goal was to explore and understand how PoseNet works and how it can be altered and changed. We don't have a lot of coding experience, so we decided to take this slow and really understand the tutorial by Daniel Shiffman. Once we understood that, we wanted to see what the limitations were and what we could build using this platform.

screen-shot-2019-03-09-at-2-56-00-pm

Documentation
Following the tutorial, we went step by step to create the canvas and set up the video capture. Before moving on we played with a few of the filters provided in the P5 library.

screen-shot-2019-03-09-at-2-17-46-pm

Once that was set up we moved on to setting up PoseNet. Using the TensorFlow.js PoseNet model, the computer estimates poses that match ours in real time and displays skeleton-like graphics of our bodies. PoseNet can be used to estimate either a single pose or multiple poses, but for simplicity we experimented with single-pose estimation, which detects only one person in a video.

We copied the ml5.js library's script tag into the p5.js editor's index.html file: <script src="https://unpkg.com/ml5@0.1.3/dist/ml5.min.js" type="text/javascript"></script>

Importing this library allows the program to detect and trace keypoint positions in the face and body using the webcam. One can then create interactive art or graphics that respond to body and face movement.
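As a rough outline of the setup from the tutorial (variable names follow the ml5.js examples), the PoseNet model is attached to the video capture and each new detection is stored in a poses array:

let video;
let poseNet;
let poses = [];

function setup() {
  createCanvas(640, 480);
  video = createCapture(VIDEO);
  video.size(width, height);
  // load the PoseNet model and listen for pose results
  poseNet = ml5.poseNet(video, modelReady);
  poseNet.on('pose', function(results) {
    poses = results; // an array of detected poses
  });
  video.hide();
}

function modelReady() {
  console.log('PoseNet model loaded');
}

function draw() {
  image(video, 0, 0, width, height);
}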

All keypoints are indexed by part id. The parts and their ids are:
screen-shot-2019-03-10-at-7-37-41-pm

When it detects a person, it returns a pose with a confidence score and an array of keypoints indexed by part id, each with a score and position.

screen-shot-2019-03-10-at-7-37-50-pm

This information allowed us to identify the position of each pose keypoint, along with the confidence score for detecting it. After retrieving this information we played around with the code and tried assigning shapes to different positions and keypoints. We also learned that to make shapes scale with the face, their size needs to be a variable defined by the distance between the nose and an eye.
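A small sketch of that idea (run inside draw(), assuming the poses array from the setup sketch above; the nose is keypoint 0 and the left eye keypoint 1):

if (poses.length > 0) {
  let nose = poses[0].pose.keypoints[0].position;
  let eye = poses[0].pose.keypoints[1].position;
  // the nose-to-eye distance grows as the face gets closer to the camera
  let faceScale = dist(nose.x, nose.y, eye.x, eye.y);
  ellipse(nose.x, nose.y, faceScale * 2, faceScale * 2); // shape scales with the face
}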

screen-shot-2019-03-09-at-2-25-30-pm

This is another example we explored that creates wobbly eyes.

screen-shot-2019-03-09-at-2-28-35-pm

Insights
We were both really blown away by how accurate PoseNet's location detection was. We both played with it alone on our computers in class and it was eerie how often it was right.

However, keypoint detection is less accurate with multiple people in frame than it is with just one person. We played around with a few demos and noticed the lag time increase when we were both in view. In the video above, the clown nose juggled between both our noses, jumping to whichever detection was more confident. It was actually kind of a fun effect; it felt like playing digital ping pong with our noses.

Now that we played with the demos, time to see what we can do with it.

WELCOME TO OUR ALIEN ABDUCTION.

screen-shot-2019-03-09-at-2-33-05-pm

Unfortunately, we forgot to hit record on this part because we were having so much fun figuring out how to target other points in the PoseNet library, add silly alien antennae, and understand how the example code for the googly eyes worked.

Here is the breakdown of what we went through to become extraterrestrials.

Step 1: Locate where the ears are
Using the example code from the Hour of Code tutorial and the index of part ids, we could easily locate the ears – left ear (3) and right ear (4). As a placeholder we drew two ellipses over the ears to make sure it was working.
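A rough sketch of that step inside draw() (it assumes the poses array from the tutorial, and that ear1X, ear1Y, ear2X, ear2Y are declared as globals, since those are the variable names used in the next step):

if (poses.length > 0) {
  let keypoints = poses[0].pose.keypoints;
  ear1X = keypoints[3].position.x; // left ear
  ear1Y = keypoints[3].position.y;
  ear2X = keypoints[4].position.x; // right ear
  ear2Y = keypoints[4].position.y;
  fill(255, 0, 0);
  noStroke();
  ellipse(ear1X, ear1Y, 30, 30); // placeholder ellipses over each ear
  ellipse(ear2X, ear2Y, 30, 30);
}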

Step 2: Create antennae above our ear location
Now that we had red ellipses floating on top of our ears, we needed to figure out a way to move them up above the ears. Since we just wanted to move them up, we only needed to change the y coordinate. In the line that places the ellipses at the ear location, we made one slight change: we multiplied the 'ear1Y' variable by 0.5, which gave us enough height.

ear(ear1X, ear1Y*0.5);
ear(ear2X, ear2Y*0.5);

To create the antennae we simply changed the ellipse's width and height so it would be long and skinny, and changed the red colour to an RGB value that gives us bright green.

function ear(x, y, size, n) {
  fill(164, 244, 66); // bright green
  noStroke();
  ellipse(x, y, 5, 100); // long, skinny ellipse for the antenna
}

Step 3: Bring in the googly eyes
The googly eyes were part of the demo code included in the Hour of Code tutorial we watched on PoseNet. All we changed was the frameCount multiplier (to 2) so the eyes would spin faster, and the eye colour, which we changed to green.

function eye(x, y, size, n) {
  let angle = frameCount * 2; // multiplying frameCount by 2 makes the eyes spin faster
  fill(255);
  noStroke();
  ellipse(x, y, size, size); // white of the eye

  fill(164, 244, 66); // green pupil
  noStroke();
  ellipse(x + cos(angle * n) * size / 5, y + sin(angle * n) * size / 5, size / 2, size / 2); // pupil orbits the centre
}

Step 4: Add a filter to the video capture
The last thing we did to achieve this strange alien look was to add a filter to the draw function.

filter(INVERT);

Final Abduction #OOTD

screen-shot-2019-03-10-at-7-16-48-pm

Information sources
https://p5js.org/learn/color.html
https://p5js.org/examples/dom-video-canvas.html
https://p5js.org/reference/#/p5/filter

Reflection
Overall, we had a lot of fun playing with this. There is so much more we could do with it, and through this learning experience we realized how achievable it was for us to use. Neither of us knows much JavaScript, but we were able to figure out how it worked and how to start making changes. We think it would be interesting to continue exploring this tool.

PoseNet Experimentations

Workshop 6 Notes
GitHub

PoseNet Experimentations: Replacing the mouse with different tracked body points, and using gesture as a type of controller through PoseNet js.

Concept

PoseNet is position-tracking software that uses real-time data analysis to track 17 key points on your body. The key points are translated into a JavaScript object which documents the pixel position each point is currently at on the webpage. These experiments attempt to see if these pixel positions can replace the computer mouse's functionality and create new control gestures through tracked body movement.

Process

I introduced myself to PoseNet through Daniel Shiffman’s coding train example that uses PoseNet with p5.js. This example was thorough and explored the base concepts of:

  1. What does PoseNet track?
  2. How do you extrapolate that data?
  3. How can you use the key data points analyzed through computer vision in relation to one another?

A clip of myself following Daniel Shiffman’s PoseNet example. Note how the code is written to isolate certain keypoints in “keypoints[1]” etc. 

PoseNet uses computer vision to document the following points on your body, in the form of pixel coordinates, each with a declared confidence. The PoseNet library allows you to identify the poses of multiple people in a webcam feed. My explorations will be focusing on actions and gestures from one participant.

A list of the key points on the body that PoseNet analyzes and creates positions for in relation to the video.

This data is then translated into JSON that the PoseNet library instantiates as a data structure at the beginning of your code.

A screenshot of the JSON that PoseNet provides that documents the coordinates.

The JSON gives you x and y coordinates that indicate the position of each point being tracked. Because of the way the JSON is structured, you are always able to accurately grab points to use as inputs.
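In outline, each entry of the array ml5 hands back looks roughly like this (the values here are placeholders):

{
  pose: {
    score: 0.92,                      // overall confidence for this pose
    keypoints: [
      { part: "nose",    score: 0.99, position: { x: 301.4, y: 215.7 } },
      { part: "leftEye", score: 0.98, position: { x: 318.2, y: 198.3 } },
      // ...15 more keypoints, indexed 0 to 16
    ]
  },
  skeleton: [ /* pairs of connected keypoints */ ]
}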

I found that the points are very similar to p5.js mouseX and mouseY coordinates. I decided to pursue a series of small experiments to see how to use gestures as a mouse movement on the screen.

1) Screen position & interaction

I wanted to see how I could use the space of the screen to interact with my body, which was being analyzed for data points through the webcam. I wanted to explore raising my hand to see if I could activate an event within a part of the screen. Interestingly enough, PoseNet does not track the point of a user's hand but is able to track their wrist. I decided to experiment with the wrist point as it was the closest to the hand gesture I wanted to explore.

I used the method in Shiffman’s tutorial to isolate my wrist (key point 10). I then divided the webcam screen into 4 quadrants. I found the x and y coordinates of my right wrist. If the coordinates of my right wrist were located within the top left* quadrant, an ellipse would appear on my wrist.

*Note, the webcam image is a flipped image of the user. The right hand is shown on the left side of the screen.
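A sketch of the quadrant check, run inside draw() and assuming the poses array from Shiffman's example (keypoint 10 is the right wrist):

if (poses.length > 0) {
  let wrist = poses[0].pose.keypoints[10].position; // right wrist
  // top left quadrant of the canvas
  if (wrist.x < width / 2 && wrist.y < height / 2) {
    fill(255, 0, 0);
    noStroke();
    ellipse(wrist.x, wrist.y, 40, 40); // marker appears only while the wrist is there
  }
}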

Video documenting myself activating the top left quadrant with my hand. 

I expanded on this idea by seeing how I could use this gesture as a controller to stop and start an action.  I chose to draw an ellipse centred on my nose. Upon page load, the colour of the ellipse would randomly be chosen. If the coordinates of my right wrist were located within the top left quadrant, the colour of my nose ellipse would start to change.

Changing the colour of my nose by using the gesture action of raising my right wrist to randomly chose the red, green, and blue values. 

When the wrist leaves the quadrant, the nose takes on the last colour that was randomly chosen. This was a quick proof of concept. The loop happens quite quickly, so the user has little control over which colour is selected.

The right wrist was an interesting key point to choose. I found myself always tracking my hand rather than my wrist. Upon testing, I noticed myself having to raise my hand higher than I anticipated to activate the quadrant. I do not think this is because of the tracking accuracy, but rather the discrepancy between a human's and the machine's understanding of where my wrist starts and ends.

2) Buttons (almost)

I was able to isolate a specific quadrant on the page and could start and stop actions. I was curious to see if I could emulate a button click, and give further didactic information through HTML inputs.  

I created a button that would change the colour of the nose upon hover. I found the coordinates of the button, along with its width and height, and then compared these to the position of the nose keypoint. For this experiment, I chose the nose rather than the wrist because I wanted a one-to-one relationship between the gesture and the action it controls (rather than having a different body part control the output of another point).

As in the previous experiment, p5.js generated a random colour for the nose. When the nose interacted with the button on the page labelled "change colour of nose", the nose would change colour.

PoseNet tracking my nose and interacting with a button created in p5.js
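The hover check itself is just a bounding-box comparison; a sketch, where btnX, btnY, btnW, btnH and noseColor are stand-ins for the actual button bounds and colour variable:

// hypothetical button bounds standing in for the real button's position and size
let btnX = 20, btnY = 20, btnW = 180, btnH = 40;

function noseOverButton(nX, nY) {
  // true when the nose keypoint falls inside the button's bounding box
  return nX > btnX && nX < btnX + btnW &&
         nY > btnY && nY < btnY + btnH;
}

// inside draw(), after noseX / noseY have been updated from the pose:
if (noseOverButton(noseX, noseY)) {
  noseColor = color(random(255), random(255), random(255)); // pick a new random colour
}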

This button is an artificial stand-in, since no mouse press event is fired; it simply tracks through position. I considered attempting to feign a mouse press, but chose not to because the outcome of this design achieved the same desired result.

3) Mapping

My next experiment took gesture further through the exploration of mapping. I found the nose ellipse colour to be an effective representation of action on the page, so I chose to map the coordinates of my hand across the screen to the colour of the nose ellipse.
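A sketch of that mapping, run inside draw() (wristX, noseX and noseY are assumed to already hold the tracked coordinates):

// map the wrist's horizontal position (0..width) to a red value (0..255)
let r = map(wristX, 0, width, 0, 255);
fill(r, 80, 150);
ellipse(noseX, noseY, 50, 50); // the nose ellipse sweeps through colours as the wrist moves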

A screenshot showing the coordinates of the right elbow.

A video of my wrist controlling the colour of my nose through a sweeping gesture. 

I enjoyed how this emulated a slider. P5.js has the capability to include a slider for volumes and other values, such as RGB. This gesture almost replaces the functionality. The next step for this gesture would be to “lock” the value of the slider rather than just having a continuously mapped value.

4) Two body points interacting together

For this experiment, I wanted to explore two parts: the first, how was it possible to create an “interaction” with two points; the second, is it possible to trigger an output like a sound?

To achieve this, I chose to see if I could use my knee and my elbow to create a gong sound when they “collided” together.

I started off by finding the coordinates for the knee and the elbow. The x and y coordinates provide very specific points; there is no range, nor is an identifiable threshold given. These two sets of coordinates are not tied to the screen as in the previous experiments and would need to respond in relation to each other. This posed a bit of an issue: how would I determine if the knee and elbow were colliding into each other? It would be too difficult and unintuitive to have the knee and elbow interact only when they had exactly the same coordinates. I decided that the knee would need to marginally surpass the y coordinate of the elbow to indicate a connection between the two points.

To start the visual representation of this experiment, I did not draw anything on the screen. When the knee and the elbow "interacted", an ellipse would be drawn on each of them.

A video of my knee and elbow attempting to trigger the drawing of two ellipses.

I found that this worked on roughly every other attempt. I added ten extra pixels in each direction to see if this would help expand the range; it helped only marginally.
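A sketch of the collision test described above, with the ten extra pixels written as a margin constant (the keypoint indices, the horizontal window, and the helper name are placeholders, not the exact code used):

const MARGIN = 10; // the extra pixels of tolerance mentioned above

function kneeMeetsElbow(pose) {
  let elbow = pose.keypoints[8].position;  // right elbow
  let knee = pose.keypoints[14].position;  // right knee
  // the knee has to rise to (or past) the elbow's height, within the margin,
  // and the two points should be roughly lined up horizontally
  let closeY = knee.y < elbow.y + MARGIN;
  let closeX = abs(knee.x - elbow.x) < 50 + MARGIN; // 50 px is an arbitrary horizontal window
  return closeY && closeX;
}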

My next step was coordinating sound. I found a free gong sound file that I loaded into p5. This, unfortunately, did not work as I wanted it to: since I was triggering the sound through the .play() function, whenever my knee and elbow stayed collided for more than 1/60 of a second, .play() was executed again. This was the beginning of a journey of callbacks and booleans that I decided not to include, as it was deterring me from the exploration of PoseNet.

The knee and elbow colliding was not as satisfying an action, or as useful a controller, as I would have liked. The action itself is very enjoyable, and having the sound of a gong would turn the body into an instrument. This action would work better if the sound was continuously playing and the body controlled the volume or other aspects of the song, rather than controlling the "starting" and "stopping" of the audio.

Conclusions

PoseNet provides a useful data model that allows people to locate their key body points on a screen. The points are accurate coordinates that can be used to interact with the screen without having to use a mouse. The experiments I conducted are initial explorations into how it is possible to use the data from PoseNet as inputs. The explorations can be used as basic models for common mapping functions, mouse clicks, and multiple input events. The output of the colour-changing nose can be replaced with outputs such as sliders, changing filters, or starting and stopping events. PoseNet is a powerful tool that can take complex physical data and create a virtual map of coordinates for people to interact with online.

My next steps in these explorations would be to explore more complex and combined patterns of interaction. All of these explorations are single actions that provide one result. PoseNet offers the ability to use up to seventeen points of data from one person, which can be multiplied with every person interacting on the same screen. The multiple poses could lead to interesting group cooperation exercises, such as all participants raising their left wrist to create an output. The basic experiments of input and output still apply, though the logic would need to change to account for multiple bodies.

Overall, gesture-based controllers seem futuristic but the technology has become extremely accessible through PoseNet and built-in webcams. PoseNet allows an easy introduction to browser-based computer vision applications. These experiments offer a very basic understanding and introduction into common gesture interactions that now become possible through PoseNet.

Research 

PoseNet documentation

How to Build a Gesture Based Game Controller 

Real-Time Estimation Using PoseNet

FaceTracking

Context

In a world full of interactive devices we find ourselves surrounded by sensors, joysticks, screens, etc. For this assignment, we decided to explore a different kind of input that does not require the user to press any buttons or screens; instead, we wanted to explore using the camera and the user's face to send commands to a computer. FaceTracking is a p5 application that uses the computer's camera and an algorithm to read and understand the user's face. The application is currently set to play a sound with the user's face movement; however, any function can be added to it.

FaceTracker

Objective

In this project, we will explore using the user’s face, as a controller to send commands to the device. This tool is a basic prototype but has potential to be scaled to include any number of functions that run based on the user’s face manipulation and movement.

This tool is based on a face-tracking application using the Haar detection technique, which uses an algorithm that contours the user's eyes, nose, mouth, eyebrows, and chin. Each element of the face is given a number, and a vector is drawn connecting the numbers. Using this tool, we were able to make a simple beat player with which the user can play simple music.

 

Link to Code

 

Design Process

High-Level Computer Vision focuses on a complex analysis of images. When talking about CV and faces, there are three major sections:

1) Detection: spotting the difference between a face and a non-face
2) Recognition: distinguishing different faces
3) Tracking: a combination of detection and recognition over time

We wanted to explore the face tracking option and create a controller using our faces. We started with Kyle McDonald’s Face Tracking Example.


We found these class notes from McDonald that explain all you need to know about CV and faces. OpenCV uses the Haar detection technique, developed by Paul Viola and Michael Jones in 2001. Haar detection can be used for any two-dimensional object, but it cannot handle significant rotation or skew, and it is very limited in the colour variation it tolerates. There is a video about HP computers that could not follow Black faces.

The face tracking example identifies 70 points on the user’s face.

We took the key points that lay out the face's elements and drew contours around them. We didn't notice a lot of change with the eyes, eyebrows, or nose, but we were able to rotate the general contour of the face. So we took the two points that define the edges of the face and compared them with each other. By comparing their y positions, we were able to identify whether the face was tilting in either direction. After that, we assigned a sound to each direction so that the user could play music by moving their head.
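A rough sketch of that comparison (the point indices are placeholders to be looked up in the tracker's point map, and leftBeat/rightBeat stand in for the two mp3 files loaded with p5.sound):

// positions[] is the tracker's array of face points, each stored as [x, y]
const LEFT_EDGE = 0;   // placeholder index for the left edge of the face
const RIGHT_EDGE = 14; // placeholder index for the right edge of the face
const TILT = 15;       // pixels of difference that count as a tilt

function checkTilt(positions, leftBeat, rightBeat) {
  let leftY = positions[LEFT_EDGE][1];
  let rightY = positions[RIGHT_EDGE][1];
  if (leftY > rightY + TILT && !leftBeat.isPlaying()) {
    leftBeat.play();   // head tilted toward one side
  } else if (rightY > leftY + TILT && !rightBeat.isPlaying()) {
    rightBeat.play();  // head tilted toward the other side
  }
}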

 

Tools & Materials Used

P5.JS online editing tool

Github

Two Mp3 files

laptop


Challenges

Trying to determine which direction the face was angled, either up or down, was slightly confusing. We used a calculation to determine the exact point at which the controller would be activated, but after re-thinking our logic we realized we did not need to.

 

Future Steps

Future iterations could include a new beat every time the user opens the page, so that others could make a variety of beats together in front of the camera on their own devices. The webcam could also include more than one face, with a random mix of different beat sets per controller, since it currently only recognizes one.

 

Useful links to look into:

 

OpenCV Website: https://opencv.org/

Kyle McDonald’s Class Note : https://github.com/kylemcdonald/AppropriatingNewTechnologies/wiki/Week-2

Kyle McDonald’s CV Examples: https://kylemcdonald.github.io/cv-examples/

OpenCV Face Detection: Visualized https://vimeo.com/12774628

How To Avoid Facial Recognition: https://vimeo.com/41861212

Face Detection For Beginners: https://towardsdatascience.com/face-detection-for-beginners-e58e8f21aad9

Subverting Body Tracking

Veda Adnani, Nick Alexander, Amreen Ashraf

Our response includes a slide deck, linked here.

We examined the field of countering computer vision (with a focus on face detection), began to speculate on further developments, and considered research and design projects.

Introduction

For our research on computer vision, we used a top-down approach. We started out trying to understand what "computer vision" is and what its implications are. Computer vision is the name given to a series of technologies which enable a computer to see. Just as the human eye lets us piece together information from our visual understanding of the world, the camera is the eye of the computational device.

As of 2019, computer vision is all around us. Our smartphones, apps, social media, banks, and other industries use computer vision every day to help humans carry out tasks with computational devices.

In Class Activity:

We started out by doing the class activity, which was to research our topic. Some of the apps we looked at were those used commercially, like the newly acquired app "…" by L'Oréal. We also looked at the list "faces in new media" by Kyle McDonald (Face in new media art, Kyle McDonald). The list is composed of artists using computer vision in new and novel ways, and includes a section on intervention which highlights artists using computer vision to counter tracking and thereby subvert these technologies.

Concepts:

We conducted a broad range of research to understand how face tracking is used by industries and governments not just to collect data but to classify humans, as well as other potential uses of computer vision. Some initial concepts we jotted down were:

  1. Deepface: using facial recognition AI algorithms to alert or highlight when being detected.
  2. Blockchain: using blockchain technologies to scramble and save data on different databases for security.
  3. Physical: Using physical objects or clothing to misdirect.

 

Some interesting things we came across in this phase of the research were the ways governments across the world are using computer vision. Privacy International is an NGO that does a lot of work on the legality of the ways in which computer vision is currently being implemented.

bodsub1

 

Instagram Face Filters:

Our first and most basic experiment was playing with Instagram face filters to understand the extent to which they can be used to alter, modify or even transform the face. One of the most striking filters we found is shown below. It is called "Face Patch" and it gradually eliminates all the features from the user's face, leaving them with only a blank patch of skin and the outline of their head. We leave this finding open to your interpretation.

 

Beating Apple’s Facial Recognition

We tried deceiving Apple's "True Depth" Face ID by using photographs; however, this did not work. What did work was using a mirror to reflect the face, which we found odd since a mirror is a flat surface and cannot convey depth. Yet it somehow managed to cheat the software and unlock the device.

 

Modiface :

We experimented with Modiface, an AR app that uses facial recognition to mock up different cosmetic products on the wearer's face. A range of brands like INGLOT use this platform to advertise their products, but what caught our attention was the app's ability to remove any scars and blemishes on the user's face, even ones the user was unaware of. It also allowed the user to change their eye colour if they desired. This was quite disturbing, and a rude awakening to the lengths the beauty and cosmetics industry goes to in order to promote vanity and unrealistic aesthetic perfection.

bodsub2

 

Free and Accessible Resources:

Accessible and free resources for body tracking are easy to find. Simple but robust face tracker tools made by independent developers, like CLM Face Tracker or Tracking.js, are available with a minimum of web searching. More robust face tracking technology, such as that developed by Intel and Microsoft, is easily accessible to businesses. Body tracking code such as PoseNet can also be found very easily.

For those who care to look, face and body tracking is widely available and can be adapted to a user’s purpose with no oversight.

 

Deceive v/s Defeat:

Through our research process, we came across two possible scenarios to subvert face recognition products. The first one was to “deceive” the intelligence into thinking that the user was someone else, and the second one was to “defeat” the system by rendering the user unidentifiable using certain tactics. Our findings below cover both of these possibilities.

 

In light of the examples listed below, we see an emerging need for subversion. Our identities, faces and bodies are sacred and personal. But we are constantly being violated by multiple entities, and it is unfair to be subjected to this kind of surveillance unknowingly. Where does this impending lack of trust leave humankind?

 

Amazon Rekognition:

https://aws.amazon.com/rekognition/

screenshot-2019-03-07-at-8-54-04-am

Amazon claims "Real-time face recognition across tens of millions of faces and detection of up to 100 faces in challenging crowded photos," and was recently caught secretly licensing this facial recognition software to multiple state governments in the USA. With its real-time tracking and the ability to analyze several camera feeds in multiple cities simultaneously, this is a serious concern for privacy and consent with regard to government surveillance.

screenshot-2019-03-08-at-7-10-48-pm

Butterfleye:

homx1mjij7pbmapf0zgv

Built on the same technology provided by Amazon Rekognition, Butterfleye is a B2C facial recognition device that was built to help businesses get to "know their customers better". Every time a customer enters a business establishment like a coffee shop, salon, or bank, the person serving the customer is immediately given a bank of data including the customer's personal details, preferences, and purchase history. They claim it's a way for businesses to become more "efficient" and serve customers better, but where does this leave any possibility of privacy for the average human being?

 

SenseTime: Viper Surveillance System

screenshot-2019-03-07-at-9-23-07-am

SenseTime is a Chinese company that focuses on AI-based facial recognition systems. It is currently the most highly valued entity of its kind in the world, at a valuation of 3 BILLION dollars. Its flagship product, the Viper surveillance system, detects faces in crowded areas and is used mostly by the government. What is shocking is that the government uses this technology most heavily in provinces with dense Muslim populations to track "terrorist" activity. However, its stated reasons for doing so are far different.

1_wfaswvx_6zpvgrc0iwr_1a

Government claims across the globe:

Most governments are employing facial recognition software for various reasons. Some claim it is to find missing children, others claim it is to prevent and stop human trafficking. However, the actual uses are far from the truth they project.

 

AI-Generated Human Faces

New Website Generates Fake Photos of People Using AI Technology

AI-assisted image editing is used in the creation of “deepfakes” (a portmanteau of “deep learning” and “fake”) which are high-quality superimpositions of faces onto bodies. Generative Adversarial Networks have also been used to generate high-quality human faces, which, using face tracking technology, can be made to seem to be speaking in real-time.

Video forensics can be used, or image metadata can be extracted and analyzed, to identify AI-generated faces and videos.

How does one evade these various entities?

Classifiers v/s Detectors:

bodysub3

One of the key distinctions in surveillance systems is between classifiers and detectors. Classifiers categorize pre-determined objects and are commonly used in face surveillance systems such as Apple's "True Depth" Face ID, which uses roughly 30,000 projected points to identify faces. Detectors have to locate and determine objects themselves, i.e. create their own bounding boxes, and are used in areas like autonomous vehicles.

NSAF: Hyphen Labs

1_qz-bkdogvwzaxa8lvyogeq

Hyphen Labs is a multidisciplinary lab which focuses on using technological tools to empower women of colour. They use human-centred design and speculative design methodologies in aid of prototyping technologies. They have developed a concept called Neurospeculative Afrofuturism, which integrates computational technologies, virtual reality and neuroscience to aid in the design of prototypes. HyperFace is a prototype which uses many faces printed onto a scarf to misdirect computer vision used for data collection and profiling. It takes the data points that tracking software looks for and graphically designs a scarf covered in those points. It also uses certain colours which are not recognized by the software.

 

Glasses that confuse surveillance:

Researchers at Carnegie Mellon University have devised a pair of glasses that "perturb" or confuse facial recognition systems.

1_hvsqkmhmsxworomdk36pwg

 

Facial Camouflage that disturbs surveillance:

A team of researchers at Stanford University led by Dr. Jiajun Lu has devised facial camouflage patterns to confuse cameras. The pattern renders the face unidentifiable across various angles, distances, lighting conditions and so on. They are experimenting with "living tattoos" for the face to create long-term solutions to fight surveillance.

1_izwnp7o8ojv1ldnuu1v5ia

 

NIR LED Glasses, Caps or Burqa:

goggles

A low-cost and feasible way to avoid facial surveillance systems is to use near-infrared LED lights. The lights are practically invisible to the naked human eye, and when designed well into a prototype they can go unnoticed while successfully blinding cameras. The first prototype was a pair of eyeglasses designed by professors Isao Echizen and Seiichi Gohshi of Kogakuin University, and since then various prototypes ranging from caps to burqas have been made. The lights are inexpensive and available on SparkFun.

 

URME Mask:

selvaggio1

The URME mask is a $400 mask sold at cost by its creator to help people evade surveillance. When worn, it is extremely realistic, and the only way a wearer can be detected is when the lack of lip movement is noticed.

Facial Weaponization suite:

screenshot-2019-03-08-at-6-35-29-pm

Facial Weaponization is a series of modelled masks created in protest against the politics of facial surveillance. The masks are made in workshops using aggregated facial data from participants, producing forms that are unrecognizable to biometric facial surveillance systems.

Concepts

amreen

In addition to exploring existing forms of facial countermeasures (like CV Dazzle), we considered utilizing the technology against itself. We imagined a digital mask that superimposed itself over any image recognized as a face taken by the device it was installed on, scrambling it and rendering it useless as facial data. We also considered near-infrared LED stickers that could be placed subtly on the face and powered by body electricity.

COMPUTER VISION : LOW & MEDIUM LEVEL CV

Group: Erman & Jing

Web: https://webspace.ocad.ca/~3173625/wush/

  • Strategy:  

Project 1: Birthday Filter.

March 6th is one of my best friends' birthdays. She lives in China, and I always want to come up with a way to celebrate our unbreakable friendship bond with a very special gift. I thought the best and most memorable gift to give her would be a birthday filter I made myself. We used p5.js and HTML to build a computational visualization experience based on Kyle McDonald's CV examples.

screen-shot-2019-03-07-at-11-02-51-am

Video: https://youtu.be/nfZkZgaMI0o

Project 2: Mixing Face

"Try mixing traditional art skills with your digital painting process for unique-looking imagery," says illustrator Jean-Sébastien Rossbach. Mixing Face is a filter that mixes your face with traditional art pieces, built with HTML, p5.js and ml5.js. It changes your skin tone using the colour and shape of the chosen painting. Which painting to use is a matter of personal preference and style. Mix your face with the art pieces you choose; no Photoshop needed this time.

screen-shot-2019-03-07-at-11-05-09-am

Video: https://youtu.be/BKlFc3KGjzw

Software and libraries:

  • Text Editor
  • Download p5 javascript libraries.
  • Download ml5 javascript libraries.
  • Cyberduck
  • Documentation:

Project one is built based on Kyle McDonald’s CV examples

Experience 1: Kyle McDonald’s CV examples

We played with a collection of interactive examples using p5.js through the link (CV examples) Kate gave us. The examples are meant to serve as an introduction to CV and the libraries we can use, and they use p5.js to access live video. All examples are self-contained and can be run independently, so we tried each of them and studied the p5.js code.

The example I liked most is "nose theremins and light painters", which uses our bodies as pointers in p5.js. One key feature of this experiment is that it allows people to use their body parts as pointers instead of the mouse.

screen-shot-2019-03-07-at-11-06-14-am

(experiences of trying example code online)

Beyond the example code, I made a few changes:

To change the amplification

input.amplification = 2;

To track other body parts:

Change the code "input.part = 'nose';" to whichever other body part you want to track:

screen-shot-2019-03-07-at-11-06-55-am

(syntax for input.part)

The Creatability experiments include several musical instruments. Having multiple interaction modes can make creative coding projects more expressive and engaging.

screen-shot-2019-03-07-at-11-07-33-am

(experience with creatiability musical instruments)

Instead of having body poses as input only, I wanted to have some output for the overall experiment.

(Things needed to build this online project)

Then we found that TensorFlow.js and Tone.js were beyond our capability, and we couldn't find example code for triggering music online. We decided to go back to our original idea of a birthday filter.

We used Photoshop to create the images we needed for the filter. We downloaded photos of 3 celebrities my friend loves and added some text.

screen-shot-2019-03-07-at-11-08-32-am

(Filter image1: birthday hat)

screen-shot-2019-03-07-at-11-09-13-am

(Filter image2: background)

screen-shot-2019-03-07-at-11-09-42-am

(Filter image3: boy1)

screen-shot-2019-03-07-at-11-11-10-am

(Filter image4: boy2)

screen-shot-2019-03-07-at-11-11-37-am

(Filter image5: boy3)

I also added a birthday song using p5.js.
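A minimal sketch of how the song could be added with the p5.sound library (the file name is a placeholder):

let birthdaySong;

function preload() {
  birthdaySong = loadSound('birthday-song.mp3'); // placeholder file name
}

function mousePressed() {
  // browsers block autoplay, so start the song on a user interaction
  if (!birthdaySong.isPlaying()) {
    birthdaySong.play();
  }
}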

Experience 2 with Processing

We found two interesting pieces of code. One is Daniel Shiffman's motion detection; the other is Abhinav Kumar's colorDrawing. They both work with Processing.

These are the codes we used: ColorDrawing and Motion Detection.

Motion Detection: This application detects motion in the camera. Motion appears in white and turns to black when the motion stops, and a drawn object follows the motion. After seeing this application we decided that if we made some changes so that it leaves a track behind, we could draw on the screen with our motions.
screen-shot-2019-03-07-at-11-12-20-am

We were able to make a few changes in the code, like changing the colour, shape, and speed of the object.

ColorDrawing: This application basically has the feature we could not build with the Motion Detection app. After selecting a colour by clicking on it, it starts drawing lines in that colour and follows the same-coloured objects in the view. If you click on another colour, it starts drawing with that colour and keeps the previous line the same. It was hard to draw or write because the camera image is mirrored, but with some practice it could be done.

screen-shot-2019-03-07-at-11-13-05-am

Video: https://www.youtube.com/watch?v=HO1x2gTZDRA&feature=youtu.be

We made a few changes in the code. It was easy to change the size and shape of the tracing object.

We also tried to combine the two pieces of code, customizing the motion tracking app first. What we wanted was colouring with motion. We focused on motion detection and tried to modify its code; however, the code did not match and gave an error on each attempt.

screen-shot-2019-03-07-at-11-13-42-am

Image: one of our trials and errors. A red dot appears but does not move with motion. You can see its code here.

 

  • Insights:  

I imagined this tool could also be used for video calling. Just as we use emojis in our chats, we could create instant, live emojis while using the camera. We could combine features of the code we found: when we use the camera, our creation could follow our body parts and appear when other people or objects appear. You could create a mask or make-up on your face and keep it while you are seen on camera. Digital game design is also a possibility. There are many possibilities in CV for colour, motion, and face tracking; however, our lack of experience and knowledge with coding was a drawback.

Experience 1: Kyle McDonald’s CV examples_Nose theremins

  1. ml5.js does not depend on p5.js and you may also use it with other libraries.
  2. If you need to run the examples offline you must download the p5.js library and ml5 library or any other library you need.
  3. Add a script tag for each library you are using in the HTML file. For example, the script tag for the ml5.js library to copy into index.html is: <script src="https://unpkg.com/ml5@0.1.3/dist/ml5.min.js" type="text/javascript"></script>
  4. PoseNet on TensorFlow.js runs in the browser, no pose data ever leaves a user’s computer.
  5. PoseNet can be used to estimate either a single pose or multiple poses, meaning there is a version of the algorithm that can detect only one person in an image/video and one version that can detect multiple persons in an image/video.
  • Information sources:

https://medium.com/@luisa.ph/nose-theremins-and-light-painters-eb8731957827

https://github.com/black/Drawing-on-Live-Video/blob/master/colordrawing.pde

Next Steps:

It would be nice to make an app or web page with which people can draw pictures. Their live camera images could be used, and they could draw with their hand or body motions, with the objects around them serving as a colour palette. Saving the images and stopping and starting the brush would be necessary. Different filters could give different art outcomes and create different experiences. With some practice, painting with motion, image filters and additional images could be fun for video use.

 

 

Computer Vision & Graphics Explorations

Exploring PoseNet & ML5

header

Workshop insights:

During the workshop I was part of a group that explored PoseNet, which allows for real-time human pose estimation in the browser using the tensorflow.js library. Read more about it here. We were able to test PoseNet in the browser demo, and during our explorations I noticed that the program would slow down when using the multiple-pose capture feature. Additionally, I noticed that the skeleton drawn was pretty accurate regardless of how form-fitting or loose one's clothing was. At the time we were not able to test the effect of different colors of clothing, as coincidentally all four of us had worn varying shades of gray. We attempted to download the GitHub repository found here; however, we had a lot of trouble running the code, since a lot of dependencies and setup are required that we didn't quite understand.

When I couldn’t get the demo working locally on my laptop I tried following the Coding Train Hour of Code tutorial on using PoseNet that is available here. In the tutorial Daniel Shiffman uses ml5.js and p5.js – ml5.js is a tensorflow.js wrapper that makes the PoseNet and tensorflow.js more accessible for intermediaries or people who haven’t had much experience with tensorflow.js. The tutorial is however not suitable for people who haven’t used p5.js before although in the video, Shiffman links to other videos for complete beginners.

Insights from the tutorial:

In this tutorial I learned:

What is ml5.js? A wrapper for tensorflow.js that makes machine learning more approachable for creative coders and artists. It is built on top of tensorflow.js, accessed in the browser, and requires no dependencies installed apart from regular p5.js libraries. Learn more here

NOTE: To use ml5.js you need to be running a local server. If you don't have a localhost setup, you can test your code in the p5.js web editor – you'll need to create an account.

You can create your own Instagram-like filters! The aim of the tutorial was to create a clown nose effect where a red nose follows your nose on screen. In theory, once you master this tutorial you can create different effects, like adding a pair of sunglasses. I learned about the p5.js filter() function, which applies a filter to an image or video. I tested out THRESHOLD, which converts pixels to black or white depending on whether they fall below a certain threshold, and GRAY, which converts the video to greyscale. Usage is filter(THRESHOLD) or filter(GRAY);

Pros & cons of using a pre-trained model vs. a custom model? When using a pre-trained model like PoseNet, a lot of the work has already been done for you. Creating a custom model is beneficial only if you are looking to capture a particular pose, e.g. if you want to train the machine on your own body, but in order to do this you will need tons of data. Think thousands or even hundreds of thousands of images, or 3D motion capture, to get it right. You could crowdsource the images; however, you have to think about issues of copyright and your own bias as to who is in the images and where they are in the world. It is imperative to be ethical in your thinking and choices.

Another issue to keep in mind is the diversity of your source images, as this may cause problems down the line when it comes to recognizing different genders or races. Pre-trained models are not infallible either, and it is recommended that you test out models before you commit to them.

What are keypoints? These are 17 datapoints that PoseNet returns and they reference different locations in the body/skeleton of a pose. They are returned in an array where the indices 0 to 16 reference a particular part of the body as shown below:

Id Part
0 nose
1 leftEye
2 rightEye
3 leftEar
4 rightEar
5 leftShoulder
6 rightShoulder
7 leftElbow
8 rightElbow
9 leftWrist
10 rightWrist
11 leftHip
12 rightHip
13 leftKnee
14 rightKnee
15 leftAnkle
16 rightAnkle

The array also returns additional information for the pose, such as the confidence score and the x,y coordinates of each keypoint. These keypoints are important, as they are how you determine where to generate your filter or effect, e.g. the clown nose.

keypoints

source: TensorFlow here

keypoints_m

Some keypoint readings and confidence scores were recorded from the capture of the image above of me sitting down. These results are printed to the console and are shown here with the array expanded: 0.99 "leftEye", 0.84 "rightEye", 0.97 "leftEar", 0.41 "rightEar", 0.01 "leftShoulder", 0.00 "rightShoulder" … 0.02 "leftHip".

Once I determined that ml5 was working correctly, I drew the clown nose – a red ellipse drawn at the x and y coordinates of my nose. To do this I used the keypoint data at index 0 of the keypoints array, which corresponds to the nose. To access this data I first needed to access index 0 of the poses array, which holds all the detected poses; this gives me the latest pose. Once I had the latest pose, I used the following to update the global variables noseX and noseY:

noseX = poses[0].pose.keypoints[0].position.x

noseY = poses[0].pose.keypoints[0].position.y

The result:

rednose_a

The nose following crashes when you go off screen! You need an if statement to check that at least one pose has been found; with the check in place, the nose simply remains stuck at the last spot where you were on screen.
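A sketch of that guard (variable names follow the tutorial's):

// inside the draw loop, before drawing the nose
if (poses.length > 0) {
  // only read keypoints when at least one pose has been detected
  noseX = poses[0].pose.keypoints[0].position.x;
  noseY = poses[0].pose.keypoints[0].position.y;
}
// otherwise noseX / noseY keep their last values instead of crashing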

rednose_a4

rednose_fix

The red nose is too bouncy! I noticed that the red nose was a little jumpy as it moved from position to position. To fix this, I used the lerp() function to smooth the values so that the nose doesn't jump immediately to a new position. The value to use in the lerp function depends on what looks good to you: I tried 0.2 at first, but this was too choppy, so I upped it to 0.5. Since I knew how to detect the nose, I attempted to track an additional keypoint and tracked my left eye, which is at keypoint index 1.
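A sketch of the smoothing (0.5 is the lerp amount that ended up looking right here):

if (poses.length > 0) {
  let nose = poses[0].pose.keypoints[0].position;
  // move halfway toward the new reading each frame instead of jumping straight to it
  noseX = lerp(noseX, nose.x, 0.5);
  noseY = lerp(noseY, nose.y, 0.5);
}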

rednose_lerp

Red nose is out of proportion! I learned that the distance between keypoints is bigger when you are closer to the camera and smaller when you are further away, which made the fixed-size nose look really big when I was far away and really small when I was close. To fix this, I needed to estimate the camera distance and draw the nose proportional to the distance between my eye and nose keypoints. This corrects the proportions so that up close the nose is big, and far away it shrinks in size.
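A sketch of the proportional sizing (nose is keypoint 0, left eye keypoint 1):

if (poses.length > 0) {
  let nose = poses[0].pose.keypoints[0].position;
  let eye = poses[0].pose.keypoints[1].position;
  // the eye-to-nose distance grows as the face approaches the camera
  let d = dist(nose.x, nose.y, eye.x, eye.y);
  fill(255, 0, 0);
  noStroke();
  ellipse(noseX, noseY, d * 2, d * 2); // the nose now scales with camera distance
}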

rednose_c

Proportions are off

rednose_proportion

rednose_b

Fixed proportions

It is possible to continue adding effects, e.g. I could create sunglasses or a hat to go with my red nose. I, however, did not like this approach, as it works best only for selfies and not full-body poses: there are too many keypoints to keep track of when attempting to create a unique effect at each point, especially with the addition of lerping. You can also create an effect where there is no keypoint; for example, there is no keypoint for the top of your head, but you can use the positions of the right and left eyes to determine where a hat should go.

Video Classification Example

I was toying around with the idea of having the algorithm detect an image in a video, so I explored video classification. It quickly dawned on me that this was a case for a custom model, as the pre-trained model seemed to work best only when generic objects were in view; at times it recognized my face as a basketball, my hand as a band-aid, my hair as an abaya, etc. I also noticed that if I brought the objects closer to the screen, the detection was slightly better. Below are some of my findings using MobileNet video classification in p5.

mnet

Ideation & Exploring PoseNet with Webcam:

I wanted to leverage the power of PoseNet to track poses in music videos, but also subvert its usage to create a trivia game that I called Name That Singer. The idea was to create a video that showed only the pose skeletons dancing, and a viewer would have to guess who the singer was based on the poses on the screen. I chose a viral video – Beyoncé's Single Ladies – that I assumed would be easy to figure out. I didn't take into account how fast the dancers in the video were moving, which made it hard to determine which song was playing when only the skeletons were showing on the screen.

For this part, I decided not to use the lerping function to create a unique effect and instead used the pre-written functions in ml5.js for PoseNet with webcam to capture the skeleton. These pre-written functions were beneficial in this case, as my points and skeletons are identical in aesthetic, so I was able to cut down on the coding needed. I followed the tutorial here, and instead of using the webcam I loaded my own videos.
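A sketch of that swap (the file name is a placeholder, and it assumes ml5's poseNet accepts a p5 video element the same way it accepts a webcam capture):

let video;
let poseNet;
let poses = [];

function setup() {
  createCanvas(640, 480);
  video = createVideo('single-ladies.mp4', videoReady); // placeholder file
  video.size(width, height);
  video.hide();
  poseNet = ml5.poseNet(video, modelReady);
  poseNet.on('pose', function(results) {
    poses = results;
  });
}

function videoReady() {
  video.volume(0); // most browsers only autoplay muted video
  video.loop();
}

function modelReady() {
  console.log('PoseNet ready');
}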

Below are some screenshots from my testing. I also tested the poses when filters such as threshold, invert, and blur were added to the video and found that the tracking was really good. Even with cartoons.

findthatsinger

Artists/Creative Coding Projects:

Chris Sugrue – She is an artist and programmer working across the fields of interactive installations, audio-visual performances, and experimental interfaces. website

chrissugrue

source: Chris Sugrue

Delicate Boundaries – bugs of light crawl off a computer screen onto human bodies as people touch the screen, exploring how our bodies would interact with the virtual world if the world in our digital devices could move into our physical world.

I liked this project [Delicate Boundaries] because it explores beyond the computer screen; it could be cool to do something like this with PoseNet where, instead of just mapping onto the screen, poses are mapped onto the body.

References:

Real-Time Human Pose Estimation in the Browser with TensorFlow.js – here

PoseNet with webcam in ML5.js – here

Github code: here

Expressive Haptic Throw Blanket

Haptics Workshop

Experiment 1: Hello Vibe Motors

First, I tested an LRA vibrating motor with the Arduino "Blink" sample code. I felt the motor vibrating on my fingertips and the back of my hand. I also tried it on my arm, and on the table under my arm, to feel the difference in skin sensing, and then on my neck and face. I noticed significant differences in the felt frequency when the motor is mounted on glabrous versus hairy skin. I tried altering the pace of vibration with long and short delays, and the vibration pattern was more recognizable at the slower pace; with short delays, it felt as if the vibration was continuous. Then I tested the LRA vibrating motor with the "Fade" example code. I explored creating a different vibration pattern by modifying the code so that the motor vibrates on even intensity counts and is off on odd counts while fading in and out. I tried it on my fingertips and the back of my hand. It wasn't easy to notice the intensity changes of the vibrations; it required a lot of concentration to recognize the fading in and out pattern.

img_7994

Experiment 2: Motor Arrays

For this experiment, I used multiple vibration motors in order to experiment with sensations that travel on the skin and the propagation of surface waves between a number of motors. I used 2 motors for this experiment and modified the "Fade" example to activate them in sequence. I tried placing them at different locations on my fingers, hands and arms to see if I could sense any haptic illusions. Surprisingly, I was able to feel a vibrating line between two vibrators placed on my pinky finger. I then made the two motors vibrate alternately with the modified "Fade" code: one fades on odd counts, and the other vibrates on even counts. It felt slightly like a line going back and forth between the two vibrating motors placed on my pinky finger.

img_7995

Experiment 3: Haptic Motor Drivers

For this experiment, we used Adafruit's haptic motor drivers. I downloaded the Arduino library and tried the "basic" example code, which goes through all the driver's different vibration patterns. I also tried the "complex" example code and played around with different vibration effects to achieve an interesting sequence. I tried many combinations at random, but for me none of the sequences made any specific sense or gave an interesting outcome; they were just trials of different haptic effects.

img_7999

img_7997

 

Haptic Feedback Design Application: Expressive Haptic Throw Blanket

Strategy:

For this workshop, I was interested in Surround Haptics, which offers immersive gaming and movie theater experiences.

I propose designing an expressive throw blanket that provides an immersive embodied experience in a home theater environment through vibro-tactile sensations on the entire body. This haptic blanket is designed to provide smooth tactile motions that intensify emotions and enhance viewers' movie experiences. The device is meant to be wearable and portable.

img_8109

I planned to create moving tactile strokes by embedding multiple vibrators in a cozy, flexible, soft, lightweight throw or blanket that is easy to put on, sit on, or wrap your body with, and is big enough to fit various body sizes. The vibrators will be equally spaced and arranged in a matrix configuration so that when the blanket is wrapped around the body, the actuators are in contact with the whole body: the shoulders, back, hips, thighs, knees, shins, backs of the legs, upper arms, lower arms, palms of the hands, and stomach. Obviously, vibrations on clothing-covered body areas will be less noticeable, so the frequency and power of the vibrations should be strong enough to be sensed. The goal is to create the illusion of tactile sensation in order to make the movie viewer feel immersed through the haptic sense. The haptic effects should flood the user's entire body.

img_8107

img_8096

Flexible haptic throw blanket can be sat on or wrapped with while watching a movie for an enhanced immersive emotional experience.

img_8089

img_8093

 

Coding and Testing Materials:

Arduino Uno

SparkFun KY-038 Sound Sensor

LRA vibrating motor

Adafruit haptic motor driver

5mm LED – Red

5mm LED – Yellow

5mm LED – Blue

3x 10K ohm resistors

Wires

Breadboard

 

Circuit Schema
img_8105

img_8099

The haptic sensation for this project is based on the Adafruit haptic motor driver effects, meaning that the vibration pattern is selected by the volume and intensity of the perceived sound. Using Arduino code, I programmed the device to be synchronized with spatial sound using a sound sensor. The haptic device responds to 3 different volume thresholds, and for each threshold a different vibration effect is actuated: for low-volume sounds the vibration pattern is smooth and short, for medium-volume sounds the pattern is moderate with medium intensity, and for higher volumes the vibration gets longer and more intense. I also included 3 different colors of LED lights as a visual representation for testing purposes.

screen-shot-2019-03-06-at-6-27-17-pm screen-shot-2019-03-06-at-6-27-36-pm

 

Insights:

It’s my first time to work with Sound detection sensors, and they can be very useful for many ambient environmental designs where sound, lighting and haptics can be all synchronized in real time. I found it challenging to create detailed tactile sensations that reflects movie events. For this project, I used general vibrations or pulses predefined in the Adafruit Motor Driver, however, the haptic feedback pattern is not synchronized with a specific movie narrative or its peak moments.

This prototype can be applied to many uses other than immersive theatre experiences. It could be interesting to use it when listening to music while relaxing on a couch or bed, where you can feel the tones of musical instruments moving along your body. It could also be very useful for deaf people, who could be notified of door knocking or any other alarming sounds that might occur.

Next Steps:

In the future, to design a more sophisticated haptic throw blanket, we could make the haptic effects feel more realistic by recording tactile signals from the real environment and assigning them to prerecorded sound effects. Sound effects could then be mapped to each part of the body so the user feels as if the sounds run through the body. These ideas require more skills and further research to be achievable.

References:

https://learn.adafruit.com/adafruit-drv2605-haptic-controller-breakout?view=all

https://create.arduino.cc/projecthub/Heathen_Hacks-v2/sound-sensor-activated-leds-with-lcd-for-sound-level-data-e7f5d2?ref=search&ref_id=KY-038%20Sound%20Sensor%20using%20Arduino&offset=0

Israr, Ali, Seung-Chan Kim, Jan Stec, and Ivan Poupyrev. "Surround Haptics: Tactile Feedback for Immersive Gaming Experiences." ACM, 2012. doi:10.1145/2212776.2212392.

 

 

 

 

 

Monstrous Anonymity

  • Strategy:

This week the goal was to take a step back and work not with technology, but against it (obviously still with, but, you know). My background in photography has always made me slightly fascinated by facial recognition, particularly since the Liquify tool in Photoshop started to incorporate facial recognition in order to streamline beauty retouching. As a tool, it generally promotes problematic and harmful ideas around beauty, but it can also be used to create somewhat monstrous manipulations of the face. I wanted to explore where this technology starts to break down – when does the manipulation stop registering as a face? When does my face stop being my face to something like Google? The ultimate goal here is to make a Photoshop action that would take a normal photo of a face and make it either not recognizable as human or not identifiable as a specific person.

  • Documentation:

I started off by turning on the facial recognition in my Google Photos. It took the better part of a day to scrub through all of my photos (26,260), but once set up, Google auto-configures a series of albums that are groupings of the same face. The first photos Google has access to of me are from 2010, which as an interesting side note is 3-4 years before I started transitioning, but you can't fool Google!

Or can you???

I have been told in passing that the forehead and the general symmetry of the face are the things to manipulate to try to confuse Google's computer vision, so I made a few different Liquify presets, then reapplied each one to the same photo and uploaded the results until Google stopped recognizing the face as me. I liked the idea of using Photoshop's facial recognition to confuse Google's facial recognition, so I wanted to keep the parameters to the parts of the face that can be directly targeted by Photoshop.

(As an aside, the rediscovery of many photos from the last 10 years was not always pleasant, so I wouldn't necessarily recommend going into it in an unconsidered way if you're a person who may experience *feelings*.)

 

Starting Photo!

img_20190306_115817

Experiment ONE:

run Once (still me)

onceimg_20190306_115817

Run Twice  (STILL ME)

twiceimg_20190306_115817

Run Three Times (NOT ME)

thriceimg_20190306_115817

Experiment TWO

It took 5 rounds of this effect to get to a place where google wouldn’t see me.

twoone

twotwo

twothree

twofour

twofive

NOT ME:

twosix

A disturbing Gif:

https://photos.app.goo.gl/RdWcDvu3QyjP82oa6

Experiment THREE

threeone

threetwo

threethree

threefour

threefive

ALL OF THEM REGISTERED AND THAT’S JUST BANANAS.

Another Gif:

https://photos.app.goo.gl/6W5bXdGCWK4Rd84f8

Here are the photoshop actions for people to play with themselves!

https://drive.google.com/file/d/1LJcp3wPgzZgdbVS5H-xr_s01EO56TgVu/view?usp=sharing

  • Insights:

This experiment felt less like a source of insights and more like an inspiration for further questions. I'm curious what the results would be if I were to play with colour editing, or noise, or transparencies. It was surprisingly difficult, or rather, the warping effects felt as though they needed to be quite extreme in order to be effective, which was unexpected. It feels as though there should be more errors if the parameters for what registers as my face are so broad, like other people should be getting caught in that net, but they are not. This is part of why I am curious about manipulating colour or noise in a photograph for potential further tests. When I was sharing the results with some friends, one of them mentioned that the brow bridge between the eyes is very crucial to how our faces get read, but that part of the face is not targetable by the Liquify panel, so it would be much harder to incorporate into an action.

  • Information sources:

Photoshop’s 2015.5’s new Face Aware Liquify for Portrait Retouching – https://www.youtube.com/watch?v=vyBGGuJhESU

  • Next Steps:

It would be nice to make a site where people could upload photos of themselves and get back a series of results showing where they become unidentifiable. Or, maybe even simpler, just a gallery where people can upload their photos after running the actions in Photoshop. I envisioned this primarily as a weird little art project, so it would be interesting to display the results together. If I were going to make it much larger, I would try to incorporate some of the colour testing to get more interesting photo outputs, and I would be interested in doing more precise testing to see if I can discover exactly where the line for identifiability is.

Brief Overview of Artificial Neural Network Systems

Last week I gave myself a crash course on neural networks and machine learning in an attempt to unpack the core algorithms and processes behind artificial neural networks, machine learning, and agents of artificial intelligence. These topics are often discussed from a mathematical perspective; I wanted to review a set of readings and make my own interpretation of what I found in programming terminology, for the sake of applying the concepts in my research.

Artificial Neural Networks (ANNs) are assemblies of artificial neurons, sometimes called units, which are not designed to imitate actual neurons but are modeled after biological findings about how the brain works. We are not yet at a stage of scientific discovery where we can say with certainty how neurons work individually or within a network in the brain, but scientists have nonetheless created digital models of their theories of how neurons communicate. These ideas have carried over to computer science as inspiration for how to develop and model algorithms that seemingly have the capacity to perceive, learn, and make decisions based on sensory data.

neuron
Above: a model of the neuron (Cichocki)

There are many artificial models of the neuron, but the general idea is that a neuron takes in a numerical array of information (often binary) and passes it through a weighted threshold in order to determine an output value. The output is also represented numerically, often as a value between 0 and 1 or -1 and 1, in binary or analog format. The output depends on the weighted threshold, which determines how 'confident' the neuron is that the input passes or fails (e.g. if at least 60% of the inputs are 1s, output a 1 signal). In some models, the output can be fed back to a weighting algorithm within the neuron to determine whether the threshold should be modified and by how much.
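A minimal sketch of such a thresholded neuron might look like the following; the weights, inputs, and the 0.6 threshold are illustrative values chosen to mirror the 60% example above, not taken from any of the readings.

// Minimal sketch of a single threshold neuron as described above.
#include <iostream>
#include <vector>

// Returns 1 if the weighted sum of the inputs reaches the threshold, else 0.
int neuron(const std::vector<double>& inputs,
           const std::vector<double>& weights,
           double threshold) {
  double sum = 0.0;
  for (size_t i = 0; i < inputs.size(); ++i) {
    sum += inputs[i] * weights[i];
  }
  return sum >= threshold ? 1 : 0;
}

int main() {
  // Five binary inputs, weighted equally: the neuron fires when at least
  // 60% of the inputs are 1s, as in the example in the text.
  std::vector<double> inputs  = {1, 1, 0, 1, 0};
  std::vector<double> weights = {0.2, 0.2, 0.2, 0.2, 0.2};
  std::cout << neuron(inputs, weights, 0.6) << std::endl;  // prints 1 (3/5 = 60%)
  return 0;
}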

nn
Above: ANN arrangements: a) FeedForward b) FeedBack c) cellular arrangement (Cichocki)

Neurons can be arranged in a number of ways within an artificial neural network. Their outputs are fed to the inputs of other neurons, which can form arrangements such as the examples above. FeedForward is essentially a chain of neurons, while FeedBack incorporates neurons dedicated to returning incoming information back into the chain. (Artificial) cellular arrangements involve neurons with multiple, nonlinear connections to each other.

perceptron

In many neural networks, these chains occur in layers, as in the feedforward multi-layer Perceptron model above, one of many models used for designing neural networks. Multiple neurons with very similar input domains and tasks are arranged to communicate with neighboring layers but not with neurons on the same layer. These networks can have more layers, but apparently run most effectively with three layers for most purposes, and the layers do not all have to contain the same number of neurons. This arrangement allows the Perceptron to analyze data in chunks instead of individually at a 'pixel' scale.
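A minimal sketch of a forward pass through this kind of layered arrangement could look like the following; the layer sizes and weights are arbitrary, and a simple sigmoid stands in for whatever activation a given model actually uses.

// Minimal sketch of a forward pass through a small layered (feedforward) network.
#include <cmath>
#include <iostream>
#include <vector>

using Vec = std::vector<double>;
using Mat = std::vector<Vec>;  // one row of weights per neuron in the layer

double sigmoid(double x) { return 1.0 / (1.0 + std::exp(-x)); }

// Each neuron in a layer computes a weighted sum of the previous layer's
// outputs and squashes it to a value between 0 and 1.
Vec forwardLayer(const Vec& in, const Mat& weights) {
  Vec out;
  for (const Vec& w : weights) {
    double sum = 0.0;
    for (size_t i = 0; i < in.size(); ++i) sum += in[i] * w[i];
    out.push_back(sigmoid(sum));
  }
  return out;
}

int main() {
  Vec input = {1.0, 0.0, 1.0};                        // "sensory" input layer
  Mat hidden = {{0.5, -0.2, 0.8}, {-0.4, 0.9, 0.1}};  // two hidden neurons
  Mat output = {{1.2, -0.7}};                         // one output neuron
  Vec result = forwardLayer(forwardLayer(input, hidden), output);
  std::cout << "network output: " << result[0] << std::endl;
  return 0;
}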

For example, in an image recognition process, a specialized neural network might analyze the direct capture of data from a camera in one layer (possibly split across red, green, and blue channels), determine small clusters where brightness values form discrete lines or patterns in a second layer, determine whether these clusters form a particular shape in a third, and finally feed this information to a single neuron that judges how closely the collective analysis from the third layer matches a 'learned' model of a cat. (Many machine learning algorithms undergo a 'training process' to prepare a neural network to recognize certain features; those that do not are considered 'unsupervised machine learning algorithms'.)

In essence, ANNs use the collective processing power of many smaller units to solve higher-level problems. The fact that these machine learning algorithms are being brought up in our Body-Centric class is rather eye-opening with regard to how such technologies might be integrated with bodily technologies. A friend of mine referred me to an example of neural networks being integrated with a prosthetic, where the prosthetic limb interpreted incoming electrical signals and gradually learned to perform actions reflective of its user's intentions. If I remember correctly, he was describing imitative machine learning, where the machine tries to imitate particular 'memories' implanted in it during its training process; perhaps such a device could imitate human hand motions. I wonder if and how some of these concepts could be carried forward to the sensory technologies we experimented with in the past, including vision, the EMG sensor, etc.

Sources:

Cichocki, Andrzej. "Neural Networks for Optimization and Signal Processing." Chichester, NY: J. Wiley, 1993. Print.

Buduma, Nikhil. "Fundamentals of Deep Learning." Sebastopol, CA: O'Reilly Media, Inc., 2017. Print.

Castano, Arnaldo Perez. "Practical Artificial Intelligence." Apress, 2018. Print.

SilentMetronome

Context

As someone who experiments with musical instruments regularly, I know that understanding tempo and rhythm is essential to playing any kind of music, regardless of how simple. In big bands, because many musicians play together, synchronization is essential, and it is usually maintained with headphones for each member or a maestro who keeps everyone on tempo. Besides being unusable for people with some disabilities, headphones can be annoying and can cause unnecessary sweating around the ears. Following that logic, I wanted to create a tool that I can use during a jam session to replace the headphones: a tool that allows me to hear and interact with my surroundings while keeping the tempo.

Vibration motors
Vibration motors

Objective

This experiment is an attempt to use Arduino and vibration motors to create a wearable metronome. This metronome uses vibration instead of audio to notify the user of each beat, making it a more accessible tool for people with hearing disabilities on one hand, and for those who prefer not to wear headphones on the other. The goal is to create a p5 interface that gives the user control of the tempo, but this version of the metronome only allows modification by hard-coding the Arduino sketch. So it is important to note that this version produces one tempo at a time; the intention is to continue expanding it, aiming for full interactivity without the need to edit the Arduino code.

Link to Video

Design Process

An idea of what the device could look like
An idea of what the device could look like
Initial sketch
Initial sketch

Class Experiments
Through the three experiments, we were asked to take note of our findings and test different methods of connecting the circuit and coding the motors. This was a very useful step because I learned at an early stage which type of connection suits which purpose. For example, connecting the motor through a transistor made the vibrations much stronger but required a more complicated circuit. On the other hand, using a motor driver offered a library of vibration patterns, but the intensity of the vibration decreased significantly. The second experiment offered what I was looking for to build the metronome: a very simple, reliable connection that allowed multiple motors, strong enough vibration, and enough flexibility to edit the vibration patterns.

First attempt using one motor
First attempt using one motor
Experimenting with two motors using simple connection
Experimenting with two motors using simple connection
Experimenting with the motor driver
Experimenting with the motor driver

Body Placement
Through my research for similar tools online, I came across the Soundbrenner Pulse, a smart wearable metronome worn on the wrist like a smartwatch. I was also considering the wrist as a placement for the metronome, based on Clint Zeagler's diagrams suggesting that the wrist is a high-sensitivity area. However, while a wrist metronome can work well for some instruments, it might not work well for percussion instruments. So, after several trials placing the motors on different parts of the skin, I eventually decided to place them on the back of the neck. This ensures there are no cables in the way and that the vibration is not hindered by the instrument itself. In addition, as part of my tattoo education, I know that the epidermal skin of the neck is thin compared to the rest of the body. Placing the motors on the back of the neck brings the whole body into the musical experience, because the vibration pulses into the body through the skin more intensely than it would on the arms.

Testing the placement of the vibration motors
Testing the placement of the vibration motors

Circuit & Code
Going through the class experiments allowed me to see different approaches to the circuit. Considering that more than one vibration motor is required to build this metronome, I wanted a simple circuit that is reliable and not hindered by the number of motors connected to the board. It is important at this stage to write code that is scalable, because this is only the first version and future iterations will have more motors and more options available. Eventually, I used a simple connection, wiring one side of each motor to a digital pin and the other side to a ground pin.

Circuit
Circuit
Snippet of Arduino code
Snippet of Arduino code

Link to code

The code I used for this project is based on the fade example from the Arduino library. After trying different settings and values, I noticed that a vibration pulse of about 50 ms was enough to simulate a percussion tap. For this version, I decided to use 75 bpm (beats per minute), which is a mid-range tempo. But since I am working with the "delay" function, I have to calculate the gap between the beats rather than the speed of the beats: at 75 bpm, 60,000 ms ÷ 75 gives an 800 millisecond period between beats.
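A minimal version of that delay-based approach might look like the sketch below; the pin numbers are assumptions about the wiring, and the 50 ms pulse plus a 750 ms pause adds up to the 800 ms period for 75 bpm described above.

// Minimal sketch of the delay-based metronome described above.
// Pin numbers are assumed; the exact wiring may differ from the original build.
const int MOTOR_LEFT  = 9;          // one vibration motor per digital pin
const int MOTOR_RIGHT = 10;

const int PULSE_MS  = 50;           // length of each "tap"
const int BPM       = 75;
const int PERIOD_MS = 60000 / BPM;  // 800 ms between beat starts at 75 bpm

void setup() {
  pinMode(MOTOR_LEFT, OUTPUT);
  pinMode(MOTOR_RIGHT, OUTPUT);
}

void loop() {
  // Pulse both motors together for one beat.
  digitalWrite(MOTOR_LEFT, HIGH);
  digitalWrite(MOTOR_RIGHT, HIGH);
  delay(PULSE_MS);
  digitalWrite(MOTOR_LEFT, LOW);
  digitalWrite(MOTOR_RIGHT, LOW);
  // Wait out the rest of the beat so the full period stays at 800 ms.
  delay(PERIOD_MS - PULSE_MS);
}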

 

Tools & Materials Used

Arduino Mega
2x vibration motors

 

Challenges

The main challenge I faced in this experiment was figuring out how to code the gaps between the beats rather than the beats-per-minute measurement. It got really complicated when I added another motor and attempted to output a different pattern on each at the same time. Because the delay function blocks the program while it waits, it causes timing issues, and I will keep working to straighten those out.

In addition, one of the main challenges during this assignment was writing the p5 code. Although it is something we have done before, after numerous attempts to make it work, for some reason it is still not sending any information through the serial port. I will continue working on that.

 

Future Steps

In order for this tool to be fully useful, it needs an equation that converts tempo values on the front-end to a delay in milliseconds on the back-end; this is crucial since the tempo has to be easily changeable by the user. I will also explore using more than two motors in the next iterations and test how that changes the experience for the user. In addition, I will explore adding more functionality to the metronome, like the option of splitting the beat into 3 or 4 sub-beats.
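The conversion itself is just 60,000 ms divided by the bpm value. Below is a hypothetical sketch of how a future version might receive a tempo value over the serial port (for example from the p5 interface, once that part works) and recompute the delay; the pin, the default values, and the accepted bpm range are all assumptions for illustration.

// Hypothetical sketch of the tempo-to-delay conversion a future version would need,
// assuming the bpm value arrives over the serial port as plain text.
const int MOTOR_PIN = 9;            // assumed wiring, as in the sketch above
const int PULSE_MS  = 50;
unsigned long periodMs = 800;       // default period: 75 bpm

void setup() {
  pinMode(MOTOR_PIN, OUTPUT);
  Serial.begin(9600);
}

void loop() {
  if (Serial.available() > 0) {
    long bpm = Serial.parseInt();            // read a tempo value sent as digits
    if (bpm >= 30 && bpm <= 300) {           // sanity range so the period stays above the pulse length
      periodMs = 60000UL / bpm;              // the conversion: 60,000 ms per minute / bpm
    }
  }
  digitalWrite(MOTOR_PIN, HIGH);
  delay(PULSE_MS);
  digitalWrite(MOTOR_PIN, LOW);
  delay(periodMs - PULSE_MS);                // remaining gap until the next beat
}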