Project 02 – Rivers

Introduction

Rivers is a media art installation piece that combines video, interactivity, tactile sensing and virtual reality audio. It functions as both a documentary and a creative experience, exploring the rivers and ravines of Toronto through their soundscapes and video images. The experience is interactively augmented by the physical sensation of touching its water-covered surface, which produces meditative visual and sonic reactions in response.

Rivers appears as a wooden box with a water-covered acrylic top surface and a pair of headphones. The acrylic surface displays video images taken from rivers and ravines in the Toronto area, while 3D audio soundscapes captured at the same locations can be heard simultaneously through headphones retrofitted with a head-tracking sensor (using Ambisonic and binaural processing). Touch interactions alter the video images, simulating the effect of disturbing small “virtual” white particles floating on the river, and simultaneously generate sounds that change according to the movements of the hand and the number of fingers placed on the acrylic surface. The soundscapes and the touch-generated sounds are delivered using 3D sound technology to create the aural sensation of physically being at the river location and of having the generated sounds move around the listener. The generated sounds also move through this virtual space, following the same direction as the touch gestures.

The videos and the audiovisual interactivity change approximately every two minutes, switching to different “scenes” that represent different locations along the rivers. In each of these scenes the produced sounds and the physics of the visual interaction change.

Rivers can be approached from different perspectives:

  • As a visual-and-tactile-only piece, where the headphones are not worn and the audience focuses on the visual interaction and the tactile perception of the water.
  • As a virtual audio experience, where the audience doesn’t look at or touch the piece but concentrates only on exploring the 3D audio soundscapes by facing different directions (eyes can be closed for increased effect).
  • As a sound art (musical) experience, where the audience focuses on finding different ways of producing and controlling sounds by touching different areas of the surface with a varying number of fingers.
  • As an integrated experience covering all of the above.

 


 

 

Context

This piece is heavily inspired by the Acoustic Ecology movement, soundscape-based sound art, and the sound field recording practice as manifested by the Phonography movement. These areas all share the goal of producing some level of awareness about our connection with our surrounding environments (natural, social, geographical, architectural) through the understanding and artistic use of sound, considering all of its implications as a physical phenomenon and as an aural perception experienced by all living organisms. Acoustic Ecology has a scientific basis but also a strong involvement of the artistic community, probably because it was founded by a group of music composers and sound artists led by the Canadian composer and Simon Fraser University professor R. Murray Schafer.

Rivers became an interesting opportunity for combining sound art with a visual interactive interface, where the visual and the acoustic can exist independently or in combination.

Related web resources

Acoustic Ecology:

Phonography:

 

System description and functionality

Hardware

The hardware I used in Rivers is all self-contained in a wooden box with an acrylic top cover that functions as a rear-projection surface and water container. It consists of:

  • 1 pico video projector
  • 1 Leap Motion sensor (IR camera)
  • 1 Mac Mini computer
  • 1 Pair of sealed-back headphones
  • 1 Arduino Pro Mini based head-tracking sensor with Bluetooth

An infrared webcam was required (as explained in the sections below). I used the Leap Motion since it was already accessible to me and because it contains two IR cameras and three IR LEDs for lighting. Only one of the cameras was used, which means that possibly any other USB IR webcam could have been used instead; some advantages of the Leap Motion’s cameras, however, are their small form factor, the wide-angle lenses that allow them to be placed in close proximity to the surface, and the possibility of stereoscopic imaging and depth sensing (see the experiments below; this was not used for the final project).

The orientation sensor is required to enhance the 3D virtual audio effect. The Arduino sends serial data via Bluetooth containing the yaw, pitch and roll orientation angles in degrees. These values are generated by fusing the data obtained from three sensors: a gyroscope, an accelerometer and a magnetometer. The components I used made the module easy to assemble, the only challenge being to make it compact enough to attach to the headphones’ headband. The components are:

In the near future I will design and 3D print a case for this module. The battery must be positioned with its top (where the cables connect to the cell) away from the magnetometer, since this part seems to produce a strong magnetic field that interferes with it. More information about the firmware running on the Arduino can be found in the next section.
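Regardless of the exact firmware, the data it streams is simple: one line of fused yaw, pitch and roll per update. A minimal, hypothetical output loop is sketched below; the “#YPR=” text framing follows the common Razor-AHRS convention and, like the baud rate and update interval, is an assumption rather than a documented detail of this project.

```cpp
// Hypothetical output loop for the head-tracker firmware. The fused yaw/pitch/roll
// values are assumed to be produced elsewhere (e.g. by a Madgwick/Mahony filter fed
// by the gyro, accelerometer and magnetometer). The "#YPR=" framing follows the
// common Razor-AHRS convention and, like the baud rate, is an assumption.
float yaw = 0.0, pitch = 0.0, roll = 0.0;   // degrees, updated by the fusion filter

void setup() {
  Serial.begin(57600);                      // UART wired to the Bluetooth module
}

void loop() {
  // updateOrientation(&yaw, &pitch, &roll);  // hypothetical fusion step
  Serial.print("#YPR=");
  Serial.print(yaw);   Serial.print(',');
  Serial.print(pitch); Serial.print(',');
  Serial.println(roll);
  delay(20);                                // roughly 50 Hz
}
```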

The headphones should preferably be ear-enclosing, studio-monitor quality with sealed-back drivers for increased sound isolation. The headphone jack is located on the front of the box and is internally connected to the headphone output of the Mac Mini. The head-tracking module is attached to the middle of the headband using a velcro fastener.


 

Software

Source code available on GitHub

The software running on the Mac Mini was programmed using OpenFrameworks plus the following libraries:

I chose OpenFrameworks over other alternatives (like Processing or Max/MSP) because of my personal interest in learning more about this particular platform and because the native C++ binaries it produces offer better performance. I tested the performance of LiquidFun and OpenCV under both Processing and OpenFrameworks, and the latter showed a much faster and more fluid (no pun intended!) response and rendering. The LiquidFun engine worked great as a way of simulating the physical interaction of floating particles. This library is developed by Google as an extension of the well-known Box2D physics engine.

I gave the resulting control application the code-name Suikinkutsu, a Japanese word for a particular type of garden fountain consisting of an underground resonating chamber that produces sound as drops of water fall into it. The logic of the application contains the following processes:

  1. Background Video and audio playback: For each scene, a video file is automatically played using an ofVideoPlayer instance. The corresponding soundscape is also played by sending a score event to the Csound instance. At the end of each video a new one is loaded and a new soundscape triggered.
  2. Virtual particles: An instance of the ofxBox2d object is filled with an ofxBox2dParticleSystem object containing 5000 white particles. The gravitational behaviour of these particles changes according to the scene.
  3. Video capturing: Using the Leap Motion API, the black-and-white (infrared) frame image from one of the cameras is obtained via an instance of the Leap::Controller object (steps 3 to 8 are condensed into a code sketch after this list).
  4. Image processing: The pixel information from the Leap camera is transferred to an ofxCvGrayscaleImage object and resized to remove the distortion (the images from the Leap are elongated on the x axis).
  5. Background removal: The ofxCvGrayscaleImage::absDiff() function is called on the image to remove the background by comparing it to an image captured right after the application is launched. The background can be recaptured at any moment by pressing the spacebar to compensate for ambient light changes (a future development will automate this).
  6. Brightness thresholding: The image is thresholded using ofxCvGrayscaleImage::threshold() to produce a binary image for OpenCV. The threshold level can also be adjusted at runtime by pressing the ‘=’ and ‘-‘ keys (a future development will save this and all other settings in a preferences text file so they don’t need to be hardcoded).
  7. OpenCV contour and blob detection: An instance of ofxCvContourFinder is used to detect the blobs appearing in the processed image. The user’s fingertips reflect the IR light from the Leap and appear as white circles in the thresholded image. This works because the white acrylic material allows the IR light to pass through. The light and images from the projector do not interfere with this process since they remain in the visible segment of the spectrum and are not detected by the IR camera. The blob size limits can be set via the ‘a’ and ‘s’ keys for the maximum area and the ‘z’ and ‘x’ keys for the minimum area.
  8. Particle animation: The centroids of the detected blobs are mapped to the width and height of the projection and used to spawn invisible circular ofxBox2dCircle objects in the ofxBox2d world at the locations where the fingertips are detected. The repulsion forces of these objects are set so that the touch appears to make the particles disperse.
  9. Head-tracking: The yaw, pitch and roll data received from the head-tracking module, via serial communication through an instance of the Razor object, is sent to Csound to be used in the sound processing. Using only the gyroscope and accelerometer it would be possible to determine the orientation, but since these sensors are not perfect (particularly the gyroscope), a reference point is needed to correct for their error accumulation and drift. This is where the magnetometer is used: the Earth’s magnetic north becomes the required reference. This also means that the zero-degree position (front) will always point towards magnetic north. This could be useful for aligning the soundscapes to their actual geographical orientation, but for the purpose of this project I needed the front to point towards the installation. To achieve this, I programmed the application to offset the orientation values by an amount equal to the current orientation of the sensor at the moment the “c” key is pressed, so that this orientation becomes the zero-degree position on all axes (yaw, pitch and roll). This calibration is required right after starting the application and in the case of occasional drift (see the calibration sketch after this list).
  10. Sound synthesis: The x position of each OpenCV blob centroid is sent to the Csound instance to be used in the synthesis of sounds and to position them virtually in space near the same location (mapped to a -180 to 180 degree frontal area). This spatial location is varied by a random fluctuation to give it a more organic sensation. The more blobs are detected, the more sounds are produced, although due to performance limits the number of simultaneous sounds is currently capped at three. The sound synthesis is performed by time-stretching and freezing the spectral FFT analysis (phase vocoder) of three short audio samples that I recorded previously: wind chimes, knocks on a wooden box and crickets, each one used in a different scene. When the user places a finger on the surface, its x position is mapped to the length of the audio sample so that the sound content at that time location is heard. If the user doesn’t move the finger, the sound is “frozen”, meaning that only the sound contained at that moment in time is heard (see the Csound communication sketch below).
  11. Spatial sound processing: The soundfields of the ambisonic soundscape recordings and the ambisonically panned synthesized sounds are mixed and rotated in the direction opposite to the angular orientation reported by the head-tracking sensor. By doing this, the sounds appear to the listener to remain in their positions, enhancing the effect of sonic virtual reality. This process is similar to the one performed for visual VR (e.g. the Oculus Rift), where the orientation of the viewer’s head is used to rotate the camera in the 3D scene. A binaural reverb is added to the synthesized sounds to enhance the sensation of being in a physical space, using Csound’s hrtfreverb diffuse-field reverberator opcode. The diffuse field is composed of the sound reflections, or reverberation, off walls and objects; these don’t contribute to sound localization but are used by the human brain to determine the size of a physical space. The final ambisonic mix is transcoded to binaural, using the virtual speaker approach, so it can be delivered via headphones.
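Steps 3 to 8 form a small computer-vision pipeline. The sketch below is a condensed, hypothetical version of that flow rather than the actual source (which is on GitHub); it assumes the Leap SDK v2 image API and stock ofxOpenCv/ofxBox2d calls, and all numeric values (working resolution, threshold, blob-area limits, circle radius) are placeholders.

```cpp
// Condensed, hypothetical version of steps 3-8. Assumes an openFrameworks app
// with ofxOpenCv, ofxBox2d and the Leap SDK v2 linked, and these ofApp members
// declared in the header and allocated in setup():
//   Leap::Controller leapController;
//   ofxCvGrayscaleImage camImage;                    // sized to the Leap frame
//   ofxCvGrayscaleImage workImage, backgroundImage;  // sized to the working resolution
//   ofxCvContourFinder contourFinder;
//   ofxBox2d box2d;
//   std::vector<std::shared_ptr<ofxBox2dCircle>> touchCircles;
void ofApp::update() {
    Leap::ImageList images = leapController.frame().images();
    if (images.count() == 0) return;
    Leap::Image raw = images[0];                       // one of the two IR cameras

    // 3-4. Copy the 8-bit IR frame into ofxOpenCv, then scale it into the smaller
    //      working image to undo the x-axis stretch of the Leap images.
    camImage.setFromPixels(const_cast<unsigned char*>(raw.data()),
                           raw.width(), raw.height());
    workImage.scaleIntoMe(camImage);

    // 5. Subtract the background frame grabbed at startup (or recaptured on spacebar).
    workImage.absDiff(backgroundImage);

    // 6. Binarize so fingertip reflections become white blobs.
    workImage.threshold(80);

    // 7. Blob detection: min/max area, up to 5 blobs considered, no holes.
    contourFinder.findContours(workImage, 20, 2000, 5, false);

    // 8. Spawn an invisible circle in the Box2D world at each fingertip so the
    //    floating particles are pushed away from the touch point
    //    (real code would also expire these circles after a short time).
    for (auto& blob : contourFinder.blobs) {
        float x = ofMap(blob.centroid.x, 0, workImage.getWidth(),  0, ofGetWidth());
        float y = ofMap(blob.centroid.y, 0, workImage.getHeight(), 0, ofGetHeight());
        auto circle = std::make_shared<ofxBox2dCircle>();
        circle->setup(box2d.getWorld(), x, y, 30);
        touchCircles.push_back(circle);
    }
    box2d.update();
}
```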

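The “c”-key calibration in step 9 amounts to storing the sensor’s current angles as an offset and subtracting that offset from every subsequent reading, wrapping the result back into the -180 to 180 range. A minimal sketch with hypothetical variable names:

```cpp
// Minimal sketch of the "c"-key re-zeroing described in step 9. Variable names
// (rawYaw, yawOffset, ...) are hypothetical ofApp members; only the
// offset-and-wrap logic is the point.
static float wrapDegrees(float a) {              // fold any angle into -180..180
    while (a >  180.0f) a -= 360.0f;
    while (a < -180.0f) a += 360.0f;
    return a;
}

void ofApp::keyPressed(int key) {
    if (key == 'c') {                            // current pose becomes the new "front"
        yawOffset   = rawYaw;
        pitchOffset = rawPitch;
        rollOffset  = rawRoll;
    }
}

// Applied to every reading before it is handed to the sound engine:
void ofApp::applyHeadTrackerCalibration() {
    calibYaw   = wrapDegrees(rawYaw   - yawOffset);
    calibPitch = wrapDegrees(rawPitch - pitchOffset);
    calibRoll  = wrapDegrees(rawRoll  - rollOffset);
}
```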
Besides the key commands described above, and for calibration purposes, the application can also display the raw and thresholded images captured by the Leap by pressing the “i” key. This is useful for checking whether the background needs to be recaptured or whether the presence of ambient IR light makes it necessary to change the filtering threshold. Pressing the “t” key (re)establishes the connection to the head-tracker, which is also useful in case the connection is lost.
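Steps 1 and 10 depend on the host application talking to the running Csound instance: a score event starts each scene’s soundscape, and control channels carry the touch position and head orientation into the synthesis and soundfield rotation. The sketch below uses the stock Csound C++ API directly; the actual project may wrap this differently, and the instrument number and channel names are invented for illustration.

```cpp
// Hypothetical host-side Csound communication for steps 1 and 10, using the stock
// Csound C++ API (the project may wrap this differently, e.g. via an addon).
// The instrument number and channel names ("touchX", "headYaw", ...) are invented.
#include <csound.hpp>
#include <cstdio>

void startScene(Csound& cs, int sceneIndex) {
    // Score event that starts the soundscape instrument for this scene (~2 minutes).
    char event[64];
    std::snprintf(event, sizeof(event), "i 1 0 120 %d", sceneIndex);
    cs.InputMessage(event);
}

void updateControls(Csound& cs, float touchX, float yaw, float pitch, float roll) {
    cs.SetChannel("touchX",    touchX);  // blob centroid x, mapped to the sample length
    cs.SetChannel("headYaw",   yaw);     // soundfield is rotated opposite to these angles
    cs.SetChannel("headPitch", pitch);
    cs.SetChannel("headRoll",  roll);
}
```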

Audio and visual content production

The videos and soundscapes used in Rivers were collected by me using a Canon T3i digital SLR camera and ambisonic sound recording equipment. The ambisonic microphone used is my own B-Format prototype, constructed using 3D-printed metallic and plastic parts and 10 mm electret microphone capsules (one omnidirectional and three bidirectional). This microphone was not built for this project; it is part of a longer personal research process.

The sound recorder used is a modified Zoom H2 that had its four internal microphone capsules removed and a 5-pin XLR connector installed instead. The parts for the modification were designed by the sound engineer Umashankar Manthravadi and are available on Shapeways (he also designed parts to build A-Format ambisonic microphones). My 3D designs for the B-Format microphone will also eventually be available through the same supplier.


 

 

Development Process

Experiment 1: Video projection

This was a simple experiment where I wanted to find out how a laser pico projector would interact with different surfaces. I knew that trying to project onto clear water wouldn’t work with a lamp-based projector, so I wanted to confirm that a laser one would also not be very effective on clear water.

From the experience with my previous project I knew that light through white acrylic would yield an interesting effect, so I tested that too. I found that the projection on acrylic caused some light refraction that made the image a bit blurry, even though laser projectors are known to remain in focus regardless of the distance to the projection surface. This effect wasn’t detrimental to my objectives and instead contributed to a good visual effect.

 

Experiment 2: Leap Motion as “see-through” IR camera

I’ve known of other systems utilizing IR cameras to track objects or fingertips placed on a transparent surface. As an example, the ReacTIVision system used in the Reactable uses this combination to sense the objects placed on the surface and to function as a touch interface. I wanted to explore this possibility without yet knowing how it would be applied to my project. I don’t possess an IR camera, so I wanted to find out whether the IR cameras in the Leap Motion could be used for this purpose.

To sense touch on the surface of a semi-transparent white acrylic sheet (like the one used in the previous experiment), an IR camera also worked very well because it is not affected by the video projection. Since I knew that I wanted to combine projection and computer vision, this seemed to be a good way of avoiding confusing OpenCV with the projected images. For this experiment I used Processing with the Leap Motion library by Darius Morawiec to get access to the feed of the IR cameras. I could have also used the TUIO framework (used in ReacTIVision), but I decided I wanted to implement my own, simpler system.

The results of the experiment were successful, with the exception of realizing that the intensity of the IR LEDs on the Leap Motion is not user controllable through the API and that they self-adjust depending on the level of ambient IR light. It seems that the reflection of the IR LEDs on the acrylic surface caused the Leap to increase their brightness, making the image filtering process imperfect due to some sections being burned out. A future experiment would be to determine whether a non-IR-reflective acrylic surface could be used.

 

Experiment 3: Magnetometer response

One of my early ideas was to use the acrylic surface covered with water (the same acrylic tray design I used for the final version) together with floating objects that could be placed on the water by the user. These objects would have magnets embedded in them, and I would use one or more magnetometer sensors under the tray to sense the magnetic fields and use that data to create the visuals projected from above. For this experiment I bought two magnetometer modules that I attached to an Arduino.

When coding the I2C communication I realized that the specific module I purchased didn’t allow setting a custom address, so it would be hard to have more than one connected to the same I2C bus (the Arduino Uno has only one). As an alternative I could have used a bit-banging I2C library (like this one) for the second module, but for this experiment I used only one. The purpose of the experiment was to find out what kind of electromagnetic variations I would get with the magnet in close proximity and along the different axes of the magnetometer. I also verified how changes in polarity would affect the values.
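For reference, reading a single three-axis sample over I2C looks roughly like the sketch below. The module used in the experiment is not named above, so the fixed 0x1E address and register map shown are the common HMC5883L ones and should be treated as an assumption; a fixed address is exactly the limitation that prevents two identical modules from sharing one bus.

```cpp
// Arduino sketch that reads one sample from an HMC5883L-class magnetometer over I2C.
// The 0x1E address and register map are the standard HMC5883L ones and are an
// assumption; other modules may differ.
#include <Wire.h>

const uint8_t MAG_ADDR = 0x1E;

int16_t read16() {                      // combine two bytes, MSB first
  int hi = Wire.read();
  int lo = Wire.read();
  return (int16_t)((hi << 8) | lo);
}

void setup() {
  Serial.begin(9600);
  Wire.begin();
  Wire.beginTransmission(MAG_ADDR);
  Wire.write(0x02);                     // mode register
  Wire.write(0x00);                     // continuous measurement mode
  Wire.endTransmission();
}

void loop() {
  Wire.beginTransmission(MAG_ADDR);
  Wire.write(0x03);                     // point at the first data register (X MSB)
  Wire.endTransmission();
  Wire.requestFrom((int)MAG_ADDR, 6);
  if (Wire.available() >= 6) {
    int16_t x = read16();
    int16_t z = read16();               // HMC5883L outputs X, Z, Y in that order
    int16_t y = read16();
    Serial.print(x); Serial.print('\t');
    Serial.print(y); Serial.print('\t');
    Serial.println(z);
  }
  delay(100);
}
```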

 

Experiment 4: Floating magnet and electromagnet interaction

This is when I had the idea of adding a sound component to my installation, generated using the electromagnetic data from the magnetometers. Following the same idea of magnetic objects floating on water, the user would place the floating objects on the water tray and move them around to change the characteristics of the synthesized sounds. I then thought I could also make them move by themselves when nobody was interacting, using electromagnets placed under the tray. To control these electromagnets I would use a PNP transistor and an Arduino to switch a 12-volt current going into the coils.
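The switching side of that idea is a one-transistor circuit driven by a digital pin. A trivial sketch, with an arbitrary pin number and pulse timing (and noting that, depending on how the PNP transistor is wired, the HIGH/LOW logic may be inverted):

```cpp
// Trivial sketch of the coil-switching side: an Arduino pin drives the transistor
// that switches the 12 V supply into the electromagnet coil. Pin number and timing
// are arbitrary; with a high-side PNP stage the on/off logic may be inverted.
const int COIL_PIN = 9;

void setup() {
  pinMode(COIL_PIN, OUTPUT);
  digitalWrite(COIL_PIN, LOW);
}

void loop() {
  digitalWrite(COIL_PIN, HIGH);   // energize the coil, nudging the floating magnet
  delay(500);
  digitalWrite(COIL_PIN, LOW);    // release
  delay(1500);
}
```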

For this experiment I used the coils found inside two different relays I purchased from a surplus store (Active Surplus). One of the coils appeared not to produce a very strong electromagnetic field, probably due to having a lower number of wire turns. A second, larger one worked much better and did provide some push to the floating wooden cubes with embedded magnets. These cubes floated quite nicely without the magnet, but after drilling a hole in the centre to insert the magnet the volume of the cube wasn’t enough to keep it on the surface. A second test was done with a light, foamy material.

 

Experiment 5: Producing a depth map with the Leap Motion

After my floating-magnet experiment I thought I could try to use the stereoscopic IR cameras on the Leap Motion to generate a depth map. I did some initial searching on the internet and couldn’t find a definitive answer about whether this is possible at all. I still thought it could be a nice way of having access to a compact and inexpensive alternative to the depth-mapping functions of the Kinect. OpenCV has the StereoBM and StereoSGBM classes that could be used to generate the depth map.
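For reference, the basic block-matching call is short. The sketch below uses the modern OpenCV C++ API rather than the Processing wrapper used in the experiment, and it assumes two already undistorted and rectified grayscale frames from the left and right IR cameras; the parameter values are typical starting points, not tuned ones.

```cpp
// Hypothetical depth-map test written against the modern OpenCV C++ API (the actual
// experiment used Processing). Assumes two undistorted, rectified grayscale frames
// from the Leap's left and right IR cameras.
#include <opencv2/opencv.hpp>

cv::Mat computeDisparity(const cv::Mat& left, const cv::Mat& right) {
    cv::Ptr<cv::StereoBM> matcher = cv::StereoBM::create(64, 21);  // 64 levels, 21x21 block
    cv::Mat disparity16, disparity8;
    matcher->compute(left, right, disparity16);                    // 16-bit fixed point (x16)
    disparity16.convertTo(disparity8, CV_8U, 255.0 / (64.0 * 16.0)); // scale for display
    return disparity8;
}
```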

I programmed a test in Processing and initially found that it didn’t work at all. I then thought this could be due to the images not aligning properly, so I added to the sketch the option of shifting one of the images horizontally to the left or right, one pixel at a time, by pressing keys. This gave some results that started to look promising, since I was at least able to see some contours and areas that matched the object (my hand) placed under the Leap, with the depth represented as shades of grey.

After some more research on the internet I finally came across this post on the Leap Motion forum about someone else trying to do the same and getting better results. It seems that the problem lies in the fact that the images from the Leap Motion are distorted due to the fish-eye type of lens mounted on the cameras. I also discovered that the Leap Motion API has some functionality for correcting this distortion, but at this point it seemed too involved a process for the amount of time I had to develop this project, so I decided to leave it for a future one.

 

Experiment 6: Colour tracking using a webcam

In this experiment I was looking into the alternative of using colour tracking to follow floating objects on the surface of the water, viewed from above, and using that information for sound synthesis. This would allow someone to place floating cubes of different colours and use the colour and position information to create or modify different sounds. At this point I decided to switch to OpenFrameworks, since I was already considering using other libraries like LiquidFun to generate the visuals (and, as mentioned before, this seemed to perform much better in OF than in Processing).

The challenges were the expected ones: the webcam was too noisy and introduced a lot of randomness into the system; the light variations made tracking very inconsistent and unstable; I wouldn’t be able to use any kind of projection on the water or the bottom of the container, since this would add extra confusion to the system; and background removal wouldn’t be a solution either, due to the constant changes in the projection. I did manage to get some level of success with the tracking, but not precise enough for my purposes. The reflections on the water also introduced a lot of noise into the system, as seen in the experiment video below (the coloured circles are the tracked positions, and the different video views show the direct image from the webcam, the HSB decomposition and the thresholded binary feed sent to OpenCV).
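The tracking step itself is standard OpenCV: convert the frame to HSV, threshold around a target hue and take the centroids of the resulting blobs. A minimal sketch against the OpenCV C++ API, with placeholder colour bounds rather than values from the experiment:

```cpp
// Sketch of the colour-tracking idea: convert to HSV, threshold around a target
// hue and return blob centroids. The hue/saturation bounds are placeholders
// (roughly blue here), not values from the experiment.
#include <opencv2/opencv.hpp>
#include <vector>

std::vector<cv::Point2f> trackColour(const cv::Mat& frameBGR) {
    cv::Mat hsv, mask;
    cv::cvtColor(frameBGR, hsv, cv::COLOR_BGR2HSV);
    cv::inRange(hsv, cv::Scalar(100, 80, 80), cv::Scalar(130, 255, 255), mask);

    std::vector<std::vector<cv::Point>> contours;
    cv::findContours(mask, contours, cv::RETR_EXTERNAL, cv::CHAIN_APPROX_SIMPLE);

    std::vector<cv::Point2f> centroids;
    for (const auto& c : contours) {
        cv::Moments m = cv::moments(c);
        if (m.m00 > 100) {                         // skip tiny, noisy blobs
            centroids.emplace_back(m.m10 / m.m00, m.m01 / m.m00);
        }
    }
    return centroids;
}
```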

 

Experiment 7: LiquidFun interactivity

Going back to using the Leap Motion infrared cameras to avoid interference from the projected video, I experimented with just tracking the object position from above and mapping that to the LiquidFun world to make the particles move. The experiment had a good level of success, but the automatically controlled intensity of the LEDs in the Leap Motion caused large variances that made it hard to use. These variances were mostly caused by the hands entering the scene under the Leap and changing the amount of light reflected off the shiny acrylic surface. As an alternative I experimented with just using a webcam and hand gestures to interact with the particles, which worked very well.

At this stage I also got the head-tracker to communicate with the OpenFrameworks application so I tested the performance of having the tracker, OpenCV and LiquidFun working at the same time.

By now I had finished building the wooden box and acrylic tray, so I was able to test the performance of the Leap Motion placed inside the box, behind the acrylic, to detect touch. It was a successful experiment, and the intensity of the LEDs remained stable since there were no other objects placed between the Leap and the projection surface.

 

Experiment 8 and conclusions: The acoustic ecology art experiment

Having most of the technical details figured out, I started thinking about the artistic content that this interactive platform could present. Making a connection between my personal interest in sound art (strongly inspired by attentive listening to our surrounding environments) and the fact that water was already present as part of the interface, I decided to test using images and field recordings captured near bodies of water. My initial thought was to gather visuals and sounds at different locations along the lake shore in Toronto, but at the time there were strong gusting winds in that area that would have made sound recording very difficult. Then it came to my mind that ravines usually sit on lower ground and are sheltered by trees, so I went out scouting for a good location.

This experiment was very successful because, even though I have long known about the work of other artists combining sound art with interactivity in an installation (and have also done sound design for them), for the first time I have produced my own interactive installation piece that I feel satisfied with and that accomplishes this combination. I really enjoyed working on this project, despite its difficulties and the limitations of time. I’m looking forward to adding more content to this platform by gathering videos and soundscapes from a wider range of ravines. I will also be looking for venues where I can present this installation in the near future.