Immersive Full-Surround Multi-User System Design

This paper describes our research in full-surround, multimodal, multi-user, immersive instrument design in a large VR instrument. The three-story instrument, designed for large-scale, multimodal representation of complex and potentially high-dimensional information, specifically focuses on multi-user participation by facilitating interdisciplinary teams of co-located researchers in exploring complex information through interactive visual and aural displays in a full-surround, immersive environment. We recently achieved several milestones in the instrument’s design that improve multi-user participation when exploring complex data representations and scientific simulations. These milestones include affordances for “ensemble-style” interaction allowing groups of participants to see, hear, and explore data as a team using our multi-user tracking and interaction systems; separate visual display modes for rectangular legacy content and for seamless surround-view stereoscopic projection, using 4 high-resolution, high-lumen projectors with hardware warping and blending integrated with 22 small-footprint projectors placed above and below the instrument’s walkway; and a 3D spatial audio system enabling a variety of sound spatialization techniques. These facilities can be accessed and controlled by a multimodal framework for authoring applications integrating visual, audio, and interactive elements. We report on the achieved instrument design.


Introduction
This paper presents design decisions and results from 5 years of ongoing research involving the AlloSphere [1,2], a three-story, immersive instrument designed to support collaborative scientific/artistic data exploration and empower human perception and action. To support group experiences of research, working, and learning, we believe that computer systems need to accommodate physically co-located users in immersive multimodal environments. We focus on research driving the full-surround, immersive, and multimodal aspects of the facility, allowing content to drive its technological development. Research in the facility is thus twofold: (1) multimedia systems design to develop a large, interactive, multimodal instrument, and (2) data generation, representation, and transformation, using a diverse set of applications to drive the development of the instrument’s capabilities for real-time interactive exploration. Our research maxim is that content drives technology, with no feature being added to our production system without first being explored in a prototype application. Our facility is designed to operate in two modes: desktop mode provides the opportunity to bring legacy content quickly into the system for rapid turnaround, while surround mode facilitates full-surround immersion (as shown in Fig. 1).
We believe that interdisciplinary teams encompassing the physical sciences, life sciences, and social sciences as well as the arts will produce audiovisual data representations that will lead to increased understanding of large and complex biological systems, social networks, and other heterogeneous, high-dimensional information. The design process for our instrument and its computational infrastructure has thus been driven by the goal of providing multi-user capabilities supporting interdisciplinary research teams.
We designed, built, and equipped our facility using in-house planning and expertise, rather than relying on a commercial or integrator-driven solution. The physical infrastructure includes a large perforated-aluminum capsule-shaped screen (two 16-foot-radius tilt-dome hemispheres connected by a 7-foot-wide cylindrical section) in a three-story near-to-anechoic room. A 7-foot-wide bridge through the center of the facility provides space for up to 30 users simultaneously. The hemispheres’ locations on the sides instead of overhead and underneath support the concept of looking to the horizon at the equator of the instrument’s infrastructure, while the joining cylindrical section avoids the in-phase acoustic echoes that would be present inside a perfectly spherical structure. The perforated screen allows for the 3D spatial audio system as well as the multi-user tracking system to be placed outside the sphere.
Over the past few years, we have focused on true multimodality, attempting an equal balance among visual, audio, and interactive representation, transformation, and generation across a diverse set of content areas. We have also concentrated on full-surround stereoscopic visual design as well as 3D spatial audio to increase immersion in the instrument. Visual calibration has been a key component of this work and we have achieved a seamless view across the multiple projectors lighting the sphere surface. Multi-user interaction using a variety of devices has been another active area of research and is detailed in this document. We believe that all these affordances facilitate immersive, multi-user participation.
The design of the facility is complemented by the development of a computational framework providing an integrated media infrastructure for working with visual, audio, and interactive data. It features a unified programming environment with components for creating interactive, 3D, immersive, multimedia applications that can be scaled from the 3-story instrument to laptops or mobile devices. We found that off-the-shelf VR software and game engines lack the flexibility to represent many forms of complex information (particularly in terms of audio [3]). Media languages such as Max [4] and Processing [5] work well for prototyping, but do not easily scale to large VR simulations. In addition, an in-house, open-source approach was chosen to foster a development community around the facility and to prevent roadblocks in development.
A variety of scientific projects and artistic explorations have driven the design and implementation of the instrument and development framework. We present several of these projects that demonstrate multi-user, multimodal interaction and illustrate our efforts in interactive, immersive data modeling and analysis.

Related work
The history of unencumbered immersive visualization systems can be traced back to CAVE-like infrastructures designed for immersive VR research [6]. These systems were designed to apply virtual reality to real-world problems while allowing a user to move freely in the environment without the need for head-mounted displays and other devices that encumber the user’s sense of self [7].
CAVEs had their roots in scientific visualization rather than flight simulation or video games and were closely connected to high performance computing applications [8]. Some of these environments were developed from CAVEs to six-sided cubes as in the StarCAVE [9] and Iowa State’s Virtual Reality Application Center. They also developed into multiple-room venues that include immersive theater-like infrastructures, video conferencing rooms, and small immersive working group rooms similar to a small CAVE. Facilities such as these include the Louisiana Immersive Technologies Enterprise (LITE) and Rensselaer Polytechnic’s Experimental Media and Performing Arts Center (EMPAC). As the first VR environments were being designed for a number of varying applications that gravitated toward a single tracked user, smaller, lower-cost immersive systems were developed [10][11][12]. There now exist a plethora of systems from the desktop to plasma screens [13] and large high-resolution displays [14] that allow for immersive visualization in a number of fields. There are also a number of VR laboratories dedicated to specific applications, such as USC’s Institute for Creative Technologies, designed for multidisciplinary research focused on exploring and expanding how people engage with computers through virtual characters, video games, simulated scenarios, and other forms of human-computer interaction [15], or UC Davis’s KeckCAVES (W.M. Keck Center for Active Visualization in the Earth Sciences) [16].
A key difference between the instrument described in this submission and CAVEs and related VR facilities lies in the instrument’s ability to provide immersive and interactive surround-view presentations to a group of people who can collaborate with different roles in data navigation and analysis. The screen geometry avoids visual artifacts from sharp discontinuity at corners, enabling seamless immersion even with non-stereoscopic projection, as shown in Fig. 2. Stereo content can be presented to a large set of users who participate in presentations from a bridge through the center of the facility. Users are generally positioned around 5 m distance from the screen, resulting in an audio and stereovision "sweet spot" area that is much larger than in conventional environments.
While we believe that there are many benefits to our instrument design, we also acknowledge its limitations. For example, the bridge provides limited room for multiple users to move from one location to another, and so navigation of virtual spaces tends to consist of one user "driving" or "flying" the shared viewpoint with a handheld device, as opposed to, e.g., a (single-user) system based on head-tracking, which could allow navigation in a virtual space via walking, head movements, etc., and would also allow a user to walk all the way around a virtual object to observe it from all sides. Similarly, since every user sees the same left- and right-eye video regardless of location along the bridge, virtual objects closer than the screen appear to track or follow a user as he or she walks along the bridge. This means that correspondence between virtual 3D location (e.g., in an OpenGL scene) and real physical space depends on the viewing position, complicating gestural interaction with virtual objects. Another limitation is that there is almost no ambient light beyond projected content, so cameras used for vision recognition and tracking will be limited to the infrared spectrum. While we do have head tracking capabilities in the instrument, large groups of users are mainly facilitated in non-tracked scenarios. All in all, these design decisions were made specifically to favor the design of multi-user, participatory, immersive, data exploration environments.
Our facility is positioned between VR environments that give fully immersive experiences to a small number of users at a time and full-dome planetarium style theaters, which have extremely high outreach potential but limited capabilities for individual interaction and collaboration [17]. There have been experiments with stereoscopy and interaction at several planetaria, and in some cases the use of stereoscopic presentation in production mode [18,19], but we believe that we are pursuing a unique combination of interactive group collaboration, stereographics, and multimodal immersion.

System overview
The AlloSphere has been designed and always used as an instrument. It is connected to a computing cluster, facilitating the transformation of computation to real-time interactive instrumentation. It was designed to minimize artifacts when representing information visually, sonically, and interactively in real-time. The capsule-shaped full-surround aluminum screen is perforated to make it acoustically transparent, allowing loudspeakers to be placed anywhere outside the screen. The instrument is acoustically and visually isolated from the rest of the building, and is suspended within a near-to-anechoic chamber to eliminate standing waves in the audio domain [20].
Multimodality is a key component for knowledge discovery in large datasets [21]. In particular, almost all of our content complements visualization with sonification, attempting to take advantage of the unique affordances of each sensory modality. For example, while human spatial perception is much more accurate in the visual domain, frequency and other temporal perception benefit from higher resolution in the audio domain, so whenever depicting complex information that takes the form of frequency relationships or temporal fine structure, we always consider mapping those frequencies and structures into the perceptual regimes of pitch and/or rhythm. Sound also greatly supports immersion; in designing full-surround displays an important consideration is that we hear sounds from every direction but can see only a limited frontal field of view.
Since it is intended as an interactive, immersive, scientific display, our design attempts to smoothly integrate instrumentation, computation and multimodal representation, forming a seamless connection of the analog to the digital that can encompass heterogeneous forms of information, including measurements from instrumental devices as well as simulations of mathematical models and algorithms.

Research and production systems, surround and desktop views
Since our primary research goals include both media systems design and interactive, immersive, multimodal data exploration across content areas, we have to maintain two or more separate systems in many areas of the instrument’s infrastructure. The primary distinction is between research, the bleeding-edge systems incorporating our best practices and latest technology, versus production, systems employing more popular, mainstream, and/or easy-to-use technologies. While research is what advances the state of the art in media systems design, we believe that production is also vital to ensure that people can easily use the instrument and bring diverse content into it, as well as to provide a platform for content research that may be more familiar to domain researchers.
With this distinction in mind, and to provide flexibility for various uses of the instrument, we have engineered two separate video display systems. Our current desktop video system provides two large quasi-rectangular lit areas somewhat like movie screens on either side of the bridge, as Figs. 3 and 4 show in context. Each is lit by a pair of overlapping (by 265 pixels) projectors with hardware geometry correction and edge blending, resulting in a field of view of approximately 127° (horizontal) by 44° (vertical). The aspect ratio of 127 ÷ 44 ≈ 2.89 compares favorably to the aspect ratio of the pixels: (2 × 1920 − 265) ÷ 1200 ≈ 2.98, indicating that the hardware-warped content does not significantly distort the aspect ratio.
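The aspect-ratio comparison above can be reproduced with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the helper name blended_width and the relative-mismatch measure are illustrative assumptions, not part of the instrument's software.

```python
# Figures quoted in the text for one desktop-mode display area.
PROJECTOR_WIDTH_PX = 1920
PROJECTOR_HEIGHT_PX = 1200
OVERLAP_PX = 265
FOV_H_DEG = 127.0
FOV_V_DEG = 44.0


def blended_width(width_px, overlap_px, count=2):
    """Width in pixels of `count` side-by-side projectors sharing overlaps.

    Hypothetical helper: each adjacent pair loses `overlap_px` columns.
    """
    return count * width_px - (count - 1) * overlap_px


# Aspect ratio of the lit area in angular terms.
angular_aspect = FOV_H_DEG / FOV_V_DEG
# Aspect ratio of the rendered pixel grid.
pixel_aspect = blended_width(PROJECTOR_WIDTH_PX, OVERLAP_PX) / PROJECTOR_HEIGHT_PX
# Relative disagreement between the two ratios.
mismatch = abs(pixel_aspect - angular_aspect) / angular_aspect

print(f"angular aspect ratio: {angular_aspect:.2f}")
print(f"pixel aspect ratio:   {pixel_aspect:.2f}")
print(f"relative mismatch:    {mismatch:.1%}")
```

The two ratios differ by only a few percent, which is the sense in which the warped content does not significantly distort the aspect ratio.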
To balance the goals of immersion, lack of apparent geometric distortion (i.e., "looking rectangular"), and privileging many view positions along the bridge to support multiple users, this hardware warping is "to the hemisphere," meaning that parallel columns of rendered pixels fall along longitude lines of the screen and parallel rows along latitude lines.
Vitally, the "desktop" display mode provides the abstraction of a standard desktop-like rectangular flat screen driven by a single computer, allowing scientists and artists to start working in the instrument with their own content as seen on standard display types. One might wonder why we do not implement the desktop display mode by first calibrating the surround display and then rendering just the pixels of the desired quasi-rectangular areas; the reason is that such a solution would require an intermediate stage of video capture and distribution to multiple, coordinated rendering machines to perform warping, which would introduce additional complexity and latency. We provide powerful Linux (Lubuntu), Windows, and OS X machines to support a wide variety of software platforms including Max/MSP/Jitter [4], Processing, LuaAV [22], native applications, or even static videos (flat or stereographic), and web pages. In each case the operating system is aware that it outputs video either to two overlapping horizontal displays or else (for Windows and Lubuntu), via an Nvidia Quadro Plex, to all four projectors simultaneously.
Audio outputs from these all-in-one production machines feed into the full audio system either through direct connection to specific speakers or by being fed into our audio rendering servers for software-controlled spatial upmixing. They can accept user input over Open Sound Control [23] from any device in the facility, or directly from mice and QWERTY keyboards accessible from the bridge. In short, almost any existing software that can take user input, output video, and/or output audio can do these things without modification in our instrument. In many cases it is not difficult to use both front and back projection areas, either by running separate copies of the software on two machines or by modifying the video code to render each scene also to a second viewport via a camera 180° opposite to the "front" camera. Such modifications are trivial in many software platforms.
The surround system consists of audio and video rendering clusters providing synchronized full surround in conjunction with a real-time HPC simulation cluster. All content is distributed according to custom networking architectures resulting from the analysis of each project’s overall flow of information. The next section discusses the surround system in detail.
So far, most production content uses the desktop display mode, whereas a sizable range of research content is using the surround display mode. Some ongoing research, such as a project by one of the authors on analyzing network security data using non-stereoscopic visualizations in a situation room context [24], uses the desktop mode for ease of content development, but our authoring environments facilitate adaptation of such content for full-surround presentation. We will eventually streamline the development of full-surround content to the point that outside partners can easily import their content for use with this mode of presentation.

Video
In this section, we present the design and implementation of our video subsystem, which consists of an arrangement of two types of stereoscopic projectors. We discuss our solution for projector calibration, which because of the capsule shape of the screen differs from full-dome projector calibration [25]. We also report on infrastructure requirements to maintain adequate noise and heat levels.

Video system
Front projection is necessary in our facility because the screen encloses almost the entire volume of the room. Currently we have implemented a 26-projector full-surround immersive visual system. First we installed four Barco Galaxy NW-12 projectors (1920 × 1200 pixel, 12k lumen, 120 Hz active stereo); these contain hardware warping and blending and comprise the desktop video system. The surround video system includes these four large projectors with hardware warping and blending turned off, plus 22 much smaller Projection Design A10 FS3D projectors (1400 × 1050 pixel, 2k lumen, 120 Hz active stereo) located above and beneath the bridge, as Figs. 5 and 6 depict. Our informal early tests indicated that projecting polarized passive stereo onto our perforated projection screen resulted in drastically reduced stereoscopic effects as compared to a plain white screen, while active (shuttering) stereo worked equally well on both types of screens. We also believe that the physical constraints on projector placement outside of the users’ bridge area would make it extremely difficult to line up two projectors for each area of the screen.
The perforated projection screens are painted black (FOV-averaged gain of 0.12) to minimize secondary light reflections and the resulting loss of contrast [2]. We determined that 12K lumens per projector was therefore needed for the four-projector setup. Because the other 22 projectors each cover a smaller area, 2K lumens per projector gives a reasonable light balance across the 26-projector system.
The projector selection and placement is closely tied to the requirement of having a dual system supporting both desktop mode and full-surround mode. We designed a unique projector configuration that maximizes the size of the warped rectangular display on each hemisphere, while at the same time accommodating full spherical projection when the large display regions are blended with those of the additional projectors. The requirement of being able to drive the full production projection system from a single computer constrained this system to a maximum of four displays, hence there being four WUXGA projectors, a side-by-side overlapping pair for each hemisphere. This four-projector cluster can be driven either by a single PC workstation via an Nvidia Quadro Plex (using MOSAIC mode to take advantage of the overlap function in the horizontal direction) or by a pair of PCs (one per hemisphere).
Several factors constrain the placement of these four projectors. The optimal placement for maximum coverage and minimal geometric distortion would be at the center point of each hemisphere. However, this is the viewing location and the projectors must be placed to minimize their impact on the user experience, namely on floor stands below the bridge structure.
Placement is optimized within the constraint of the available lens choices. Maximum coverage on a hemisphere is achieved with the widest available standard lens, which is a 0.73:1 short-throw lens. The two projectors on either side are placed opposite to their respective screen areas such that the frusta are crossed. This increases the distance to the screen while allowing the placement to be moved forward such that the lenses align with the front edge of the bridge, one on either side. They are placed at the maximum height, limited by the bridge clearance, and with the lenses moved close to the center in the lateral axis. As the lenses are offset in the projector body, the placement is offset asymmetrically to compensate. The pitch is set to 42° to point up toward the centerline of the hemispheres, and the roll axis is tipped 5° (the maximum allowed by projector specifications) to spread the lower corners of the covered area, further increasing the available rectangular area.
When geometry correction is disabled, the projected area meets the corners of the doorways at either end of the space and overlaps in the center, leaving a single, connected upper dome region and a separate lower area on each hemisphere uncovered. The eight Projection Design A10 FS3D projectors in the overhead area of each doorway at the ends of the bridge (shown in Fig. 5) cover the upper dome region, and fourteen more of these projectors placed below the bridge cover almost all the lower portion of each hemisphere. We first arranged the coverage areas in a symmetrical fashion with each projector overlapping its neighbors, then further adjusted in an asymmetrical arrangement to optimize the size and shape of the overlapping regions to facilitate smooth blending. Our criteria for arranging the overlap are twofold: avoid there being more than three projectors lighting any given area of the screen (because more overlapping projectors means a higher black level and a lower contrast ratio), and maximize the area of any given overlapping region (because larger overlap regions can be blended with more gradual changes in alpha map values). Fig. 8 is a photograph showing how the current projectors light up the screen and Fig. 9 is a diagram of the resulting pixel density across the entire screen. Projector placement beneath the bridge and over the bridge doorways provides complete visual coverage in some of the most difficult areas of the screen.
The 26 projectors receive video signals from 13 Hewlett Packard z820 workstations each containing two Nvidia K5000 stereo graphics cards; together these form our first prototype full-surround visual system. For the desktop visual system (described in Section 2.1), a separate HP z820 machine drives all four Barco projectors via an Nvidia Quadro Plex containing Nvidia K5000 graphics cards. Barco Galaxy NW-12 projectors each have two DVI inputs and can switch between them (as well as turn on and off hardware warping and blending) through commands sent over Ethernet. Thus the surround and desktop display modes use different computers but some of the same projectors, and we can easily switch between them under software control.

Video timing synchronization
The integration of these 26 projectors forced us to synchronize two different projector technologies that to our knowledge have not been integrated before. The Barco projectors have a single 120 Hz interleaved left and right eye input, whereas the Projection Design (PD) projectors have a dual-channel 59.98 Hz left and right eye input. To synchronize these two subsystems, a Quantum Composers model 9612 pulse generator acts as a dual-channel house sync source. The PD projectors operate only within a narrow frequency range around 59.98 Hz. The Barcos are more tolerant of varying input rates and can receive 119.96 Hz (2 × 59.98) with the appropriate ModeLine in the xorg.conf file. The pulse generator has two outputs, T1 and T2, which we dedicate respectively to the 119.96 Hz and 59.98 Hz projection subsystems. Though these share the same period (1/119.96 Hz), each has its own pulse width. T1’s width gives it a 50% duty cycle at 119.96 Hz: W = 1/2 × (1/119.96 Hz) ≈ 4.1681 ms. T2’s width is set to 0.4 μs longer than the entire period, so that it just misses every other rising edge and therefore runs at half the frequency (59.98 Hz). Table 1 shows all these settings, which are saved to memory to load each time the unit is powered on, making the system more robust to power failure.
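The pulse-width trick used for T2 can be checked numerically. The following is a minimal sketch that assumes the output behaves as a non-retriggerable pulse (a trigger edge arriving while the previous pulse is still high is ignored); that behavioral model and the function name output_rate_hz are assumptions made for illustration, not a description of the pulse generator's firmware.

```python
BASE_HZ = 119.96        # rate of the interleaved-stereo subsystem (from the text)
EXTRA_WIDTH_S = 0.4e-6  # T2 pulse is this much longer than one full period

period_s = 1.0 / BASE_HZ
t1_width_s = 0.5 * period_s            # 50% duty cycle on T1
t2_width_s = period_s + EXTRA_WIDTH_S  # slightly longer than one period on T2


def output_rate_hz(trigger_period_s, pulse_width_s, n_triggers=1000):
    """Effective output rate of an assumed non-retriggerable pulse output.

    A trigger edge is ignored when it arrives before the previous pulse
    has ended, so a pulse longer than one period skips the next edge.
    """
    busy_until = -1.0
    fired = 0
    for k in range(n_triggers):
        t = k * trigger_period_s
        if t >= busy_until:
            fired += 1
            busy_until = t + pulse_width_s
    return fired / (n_triggers * trigger_period_s)


print(f"T1 width: {t1_width_s * 1e3:.4f} ms")
print(f"T1 rate:  {output_rate_hz(period_s, t1_width_s):.2f} Hz")
print(f"T2 rate:  {output_rate_hz(period_s, t2_width_s):.2f} Hz")
```

Under this model a pulse that outlasts one period by any small margin suppresses exactly every other trigger, so both outputs stay phase-locked to the same clock while T2 runs at half the rate.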
According to the Nvidia Gen-Lock mode of operation, a display synchronized to the TTL house-sync input is a "server" display, whereas the multiple displays synchronized over the Frame-Lock network are "client" displays. In order to combine the Gen-Lock and Frame-Lock networks, the server display must reside on a computer for which it is the only display for that machine. Therefore we dedicate two additional (low-end) computers to provide synchronization for our system, each acting as a "server" display, one at 59.98 Hz and the other at 119.96 Hz. Each of these machines accepts the appropriate house-sync signal from the pulse generator at the TTL input of an Nvidia G-sync card and provides the Frame-Lock signal to be sent to the video rendering workstations at the RJ45 output of the same board. Thus, we can synchronize a Frame-Lock network to the house-sync by isolating the "server" display to a screen that is not seen in the sphere. Furthermore, the entire system is robust to power cycles by being configured to initiate synchronization on startup. The overall result is that all projectors and shutter glasses switch between left-eye and right-eye at the same time, so that stereographics work seamlessly throughout the instrument.

Pixel density
We have analyzed the estimated 3D positions of all projected pixels output by the calibration method described in Section 3.2. Fig. 9 is an equal-area projection of a map of pixel density (i.e., pixels per steradian) in each direction as seen from a standing position in the center of the bridge. Values range from zero in the uncovered sections to a maximum of about 15 M pixels per steradian. (Naturally the areas with lower pixel density have correspondingly larger pixel sizes.) We see that almost the entire screen (minus the two doorways) is lit down to over 60° below the horizon. Of course the pixel density varies greatly in overlap regions compared to regions covered by a single projector; we tried to discount this by weighting each pixel linearly by its alpha value, but the overlap regions are still clearly visible. We also see a smooth gradient of pixel density along the images projected by the Barcos, since the throw distance varies greatly between the top and bottom rows of pixels.
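An alpha-weighted density map of this kind can be sketched by binning pixel directions on an equal-area grid. The sketch below is illustrative only: the coordinate convention (y up, −z straight ahead), the grid resolution, and the function name density_map are assumptions, and the analysis behind Fig. 9 may have used a different projection and binning.

```python
import math


def density_map(directions, alphas, n_az=8, n_el=4):
    """Alpha-weighted pixel count per steradian on an equal-area grid.

    directions: unit vectors (x, y, z) from the viewpoint, y up (assumed).
    alphas: blend weights in [0, 1], one per direction.
    Bins are uniform in azimuth and in sin(elevation), so every bin
    subtends the same solid angle, 4*pi / (n_az * n_el) steradians.
    """
    bin_sr = 4.0 * math.pi / (n_az * n_el)
    grid = [[0.0] * n_az for _ in range(n_el)]
    for (x, y, z), alpha in zip(directions, alphas):
        az = math.atan2(x, -z)  # 0 means straight ahead along -z
        c = min(int((az + math.pi) / (2.0 * math.pi) * n_az), n_az - 1)
        r = min(int((y + 1.0) / 2.0 * n_el), n_el - 1)  # y equals sin(elevation)
        grid[r][c] += alpha
    return [[weight / bin_sr for weight in row] for row in grid]


# Two overlapping pixels straight ahead, one fully lit and one half blended.
demo = density_map([(0.0, 0.0, -1.0), (0.0, 0.0, -1.0)], [1.0, 0.5])
```

Weighting by alpha means that two projectors blended at 50% each contribute the same weight as one unblended projector, which only partly discounts the extra pixels in overlap regions.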
We also see that each Barco projector lights an area much greater than any Projection Design projector, so that even with the Barcos’ greater resolution (2.3 M vs. 1.47 M pixels each), the pixel density is significantly greater in the overhead dome and especially below the areas the Barcos cover. This is a result of the need for the desktop display mode to use few enough pixels to make real-time rendering practical from a single machine for production content. While this design facilitates use of the instrument, it poses limitations to the visual resolution of the display at the most important area (i.e., where people naturally and comfortably rest their gaze). Fig. 10 is a histogram showing the distribution of the area covered by each pixel in the instrument, and Table 2 gives the minimum, mean, and maximum of the pixel areas for each of the 26 projectors and for the instrument as a whole. Our estimate of pixel area again starts with the 3D coordinates of the estimated "center" of each pixel as output by the calibration method described in Section 3.2. We neglect the local screen curvature in the region of each pixel (approximating it as planar) and model each pixel as a parallelogram. The (approximately) vertical vector that is equivalent to two of the sides of the parallelogram is half of the vector difference between the pixels immediately above and below, and likewise in the other direction. The magnitude of the cross product between these two vectors gives the area. Given the estimated position $p_{r,c}$ of the pixel at row $r$ and column $c$ of a given projector, the estimated area is given by
$$\mathrm{vertical} = \tfrac{1}{2}\,(p_{r-1,c} - p_{r+1,c}), \qquad \mathrm{horizontal} = \tfrac{1}{2}\,(p_{r,c+1} - p_{r,c-1}), \qquad \mathrm{area} = \lVert \mathrm{vertical} \times \mathrm{horizontal} \rVert .$$
This estimate ignores a one-pixel-wide border around each projector.
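The parallelogram estimate described above is straightforward to express in code. The following minimal sketch assumes the calibration output is available as a nested list p where p[r][c] is the (x, y, z) center of the pixel at row r and column c; that data layout and the helper names are hypothetical.

```python
import math


def sub(a, b):
    """Component-wise difference of two 3D points."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    """Cross product of two 3D vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    """Euclidean length of a 3D vector."""
    return math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)


def pixel_area(p, r, c):
    """Parallelogram area estimate for the interior pixel at (r, c).

    Each side vector is half the difference between the two neighboring
    pixel centers, so border pixels (missing a neighbor) are not handled.
    """
    vertical = tuple(0.5 * d for d in sub(p[r - 1][c], p[r + 1][c]))
    horizontal = tuple(0.5 * d for d in sub(p[r][c + 1], p[r][c - 1]))
    return norm(cross(vertical, horizontal))


# Flat test grid: pixel centers 2 units apart horizontally, 3 vertically.
grid = [[(2.0 * c, 3.0 * r, 0.0) for c in range(3)] for r in range(3)]
center_area = pixel_area(grid, 1, 1)
```

On a flat, evenly spaced grid the estimate reduces to the product of the pixel pitches, which makes a convenient sanity check before applying it to curved-screen data.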
Currently the 26 projectors give an uneven distribution of approximately 41.5 million pixels. We believe that achieving eye-limited resolution in the instrument requires a minimum of approximately 50 million pixels evenly distributed on the sphere surface, which will probably require completely separating the desktop and surround display systems by adding additional projectors to the surround display system to light the areas currently covered only by the Barcos.
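The pixel total follows directly from the projector counts and native resolutions given earlier. The short calculation below is only a bookkeeping sketch; the dictionary layout and names are illustrative, and the raw count ignores pixels lost to overlap and to uncovered or off-screen regions.

```python
# Counts and native resolutions quoted in the text.
PROJECTORS = {
    "Barco Galaxy NW-12": {"count": 4, "width": 1920, "height": 1200},
    "Projection Design A10 FS3D": {"count": 22, "width": 1400, "height": 1050},
}
TARGET_PIXELS = 50_000_000  # approximate eye-limited target stated in the text


def total_pixels(projectors):
    """Sum of native pixels over all projectors, ignoring overlap losses."""
    return sum(p["count"] * p["width"] * p["height"] for p in projectors.values())


total = total_pixels(PROJECTORS)
shortfall = TARGET_PIXELS - total
print(f"total native pixels: {total / 1e6:.2f} M")
print(f"shortfall to target: {shortfall / 1e6:.2f} M")
```

Because the target calls for an even distribution, the raw shortfall should be read as a lower bound on what additional projectors would need to supply.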

Video calibration and multi-user surround stereographics
We deployed software for calibrating and registering multiple overlapping projectors on nonplanar surfaces [26]. This software uses multiple uncalibrated cameras to produce a very accurate estimate of the 3D location of each projector pixel on the screen surface as well as alpha maps for smooth color blending in projector overlap regions. We use 12 cameras (shown in Fig. 8) with fisheye lenses to calibrate our 26-projector display into a seamless spherical surround view. First we calibrate our fisheye cameras to be able to undistort the images they produce. Then standard structure-from-motion techniques [27] are used to recover the relative position and orientation of all the adjacent camera pairs with respect to each other, up to an unknown scale factor. Next, stereo reconstruction recovers the 3D locations of the projector pixels in the overlap region of the cameras. Following this, through a non-linear optimization, the unknown scale factors and the absolute pose and orientation of all the cameras are recovered with respect to one of the cameras that is assumed to be the reference camera. This allows us to recover the 3D location of all the projector pixels in this global coordinate system using stereo reconstruction. Finally, in order to find a camera-independent coordinate system, we use the prior knowledge that there are two gaps in the screen at the beginning and end of the bridge corridor (see Fig. 5). Using this information, we recover the 3D location of the corridor and align the coordinate system with it such that the corridor is along the Z-axis and the Y-direction is upwards.
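The final alignment step can be illustrated with a small change-of-basis computation. This is a sketch under stated assumptions, not the calibration software's method: it supposes that the centers of the two doorway gaps (door_a, door_b) and a rough upward direction (up_hint) are already known in the reference-camera frame, and all names are hypothetical.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(a):
    length = math.sqrt(dot(a, a))
    return (a[0] / length, a[1] / length, a[2] / length)


def corridor_frame(door_a, door_b, up_hint):
    """Build a frame with Z along the corridor and Y as close to up_hint as possible.

    Returns (origin, x_axis, y_axis, z_axis); the origin is the corridor midpoint.
    """
    origin = tuple(0.5 * (a + b) for a, b in zip(door_a, door_b))
    z_axis = normalize(sub(door_b, door_a))
    # Gram-Schmidt: remove the part of up_hint that lies along the corridor.
    along = dot(up_hint, z_axis)
    y_axis = normalize(tuple(u - along * z for u, z in zip(up_hint, z_axis)))
    x_axis = cross(y_axis, z_axis)  # completes a right-handed frame
    return origin, x_axis, y_axis, z_axis


def to_corridor_coords(point, frame):
    """Express a reference-frame point in the corridor-aligned frame."""
    origin, x_axis, y_axis, z_axis = frame
    d = sub(point, origin)
    return (dot(d, x_axis), dot(d, y_axis), dot(d, z_axis))


frame = corridor_frame((1.0, 0.0, 0.0), (5.0, 0.0, 0.0), (0.0, 0.0, 1.0))
aligned = to_corridor_coords((5.0, 0.0, 0.0), frame)
```

Applying to_corridor_coords to every recovered pixel position would yield coordinates that no longer depend on which camera was chosen as the reference.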
The recovered 3D locations of the pixels are then used to warp the images such that overlapping pixels from the different projectors show the same content. However, the method of warping provided (based on a projection matrix and UV map per projector) does not scale well to surround stereoscopic projection. Hence, we developed alternative systems based on the same projector calibration data. The solution principally in use renders the scene to an off-screen texture and then applies a pre-distortion map from this texture to screen pixels in a final render pass. We are also currently refining a second solution that performs the pre-distortion warp on a per-vertex basis while rendering to the screen in a single pass. As noted in [28], warping by vertex displacement is in many cases more efficient than texture-based warping, avoiding the necessity of multiple rendering passes and very large textures (to avoid aliasing). The principal drawback of vertex-based pre-distortion is incorrect interpolation between vertices (linear rather than warped). This error was apparent only for extremely large triangles, and was otherwise found to be acceptable (because incorrect curvature draws less attention than a broken line). Using higher-polygon-count objects or distance-based tessellation reduces the error. Looking toward a future of higher-performance rendering, we have also implemented a third solution of physically based rendering using the results of the projector calibration, in which the entire scene is rendered with ray-casting and ray-tracing techniques, incorporating the OmniStereo adjustments for full-dome immersion at interactive rates (see Fig. 1).
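To make the texture-based approach concrete, the sketch below builds a pre-distortion lookup table from calibrated pixel positions. It assumes, purely for illustration, that the off-screen texture is an equirectangular panorama centered on the viewpoint with y up and −z forward; the source does not state the texture layout, and the function names are hypothetical.

```python
import math


def direction_to_equirect_uv(p):
    """Map a 3D position (viewpoint at the origin) to (u, v) in [0, 1].

    Assumed layout: u spans azimuth with 0.5 straight ahead (-z),
    v spans elevation with 0.5 on the horizon and 1.0 at the zenith.
    """
    x, y, z = p
    r = math.sqrt(x * x + y * y + z * z)
    u = math.atan2(x, -z) / (2.0 * math.pi) + 0.5
    v = math.asin(y / r) / math.pi + 0.5
    return (u, v)


def build_warp_map(pixel_positions):
    """One texture coordinate per projector pixel.

    pixel_positions[r][c] is the calibrated 3D screen position lit by the
    projector pixel at row r, column c. In the final render pass each
    output pixel would sample the off-screen texture at its stored (u, v).
    """
    return [[direction_to_equirect_uv(p) for p in row] for row in pixel_positions]


# Tiny example: a 1 x 3 "projector" lighting points ahead, to the right, and overhead.
warp = build_warp_map([[(0.0, 0.0, -5.0), (5.0, 0.0, 0.0), (0.0, 5.0, 0.0)]])
```

Because the map depends only on calibration data, it can be computed once per projector and reused every frame, leaving the per-frame cost as a single texture lookup per output pixel.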
Where the classic, single-user CAVE performs stereoscopic parallax distortion according to the orientation of the single user (e.g., by head tracking), in our multi-user instrument no direction can be privileged. Instead, we employ a 360° panoramic approach to stereoscopics along the horizontal plane. This results in an ideal stereo parallax in the direction of vision but is compromised in the periphery, in a similar fashion to OmniStereo [29]. The stereo effect is attenuated with elevation, since at the apex of the sphere no horizontal direction has privilege and it is impossible to distinguish "right" from "left." We found panoramic cylindrical stereography through the OmniStereo [29] slice technique to present an acceptable stereo image, but to be prohibitively expensive due to repeated rendering passes per slice. Reducing the number of slices introduced visible, sharp discontinuities in triangles crossing the slice boundaries. Panoramic cylindrical stereography through per-vertex displacement on the GPU proved to be an efficient and discontinuity-free alternative (with the same benefits and caveats as for vertex-based pre-distortion outlined above).
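A minimal CPU-side sketch of the per-vertex displacement idea follows. It assumes a view space with the viewer at the origin and Y pointing up, and it assumes a cosine-of-elevation attenuation; the text above states only that the effect is attenuated with elevation, so the exact falloff here is an assumption, as is the function name `omnistereo_displace`. In practice this logic would run in a GPU vertex shader.

```python
import math

def omnistereo_displace(vertex, eye_sep, eye):
    """Shift a view-space vertex as if seen from the eye that faces it.

    vertex:  (x, y, z) relative to the viewer, Y up.
    eye_sep: interocular distance in scene units.
    eye:     -1 for the left eye, +1 for the right eye.
    """
    x, y, z = vertex
    horiz = math.hypot(x, z)                  # distance in the horizontal plane
    dist = math.sqrt(x * x + y * y + z * z)
    if horiz == 0.0 or dist == 0.0:
        return vertex                         # straight up or down: no "left" or "right"
    dx, dz = x / horiz, z / horiz             # horizontal viewing direction
    right = (-dz, dx)                         # direction x up, projected onto the XZ plane
    atten = horiz / dist                      # cos(elevation), assumed attenuation
    offset = eye * 0.5 * eye_sep * atten
    # Moving the eye by +offset is equivalent to moving the vertex by -offset.
    return (x - offset * right[0], y, z - offset * right[1])
```

Because each vertex is displaced according to its own azimuth, the result varies continuously around the full panorama, so no slice boundaries arise.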

Projector mounting, sound isolation, and cooling
We custom fabricated floor stands for the Barco projectors with channel-strut steel and standardized hardware (shown in Fig. 7). The projectors are massive (70 kg / 154.2 pounds each) and need to be placed at an overhead height, so we designed rigid four-legged stands with a large footprint for high stability. Cantilevered beams made from double-strut I-beams atop the legged frame allow the projector placement to extend over the lower portion of the screen. The beams are hinged to the leg structure for proper incline of 42°, and swivel brackets join the projector mounting plates to the cantilever beams to allow for the roll angle of 5°.
In order to preserve the audio quality within the instrument, we must isolate the noise of equipment located within the near-to-anechoic chamber. Since front projection is our only option, the projectors reside inside the chamber (and indeed inside the sphere). The large Barco projectors located beneath the bridge (as shown in Figs. 5 and 7) generate by far the most noise.
The sound isolation enclosures provided by the projection company needed to be re-engineered due to our stringent specifications of noise floor within the chamber. A rear compartment of the enclosures was engineered to act as an exhaust manifold with acoustic suppression. The compartment was lined with AMI "Quiet Barrier Specialty Composite," a material which achieves a high level of noise abatement with a sandwich structure of a high-density loaded vinyl barrier between two lower-density layers of acoustical foam. An aluminized mylar surface skin provides thermal protection for use at elevated temperatures. The heated exhaust from the Barco Galaxy 12 projectors collects in this manifold compartment.
We removed the very loud factory-supplied fans and instead added an exhaust duct at the output where we attached 6-in. diameter insulated ducting. Low-noise in-line duct fans (Panasonic Whisperline FV-20NLF1 rated at 240 cfm with a noise specification of 1.4 sones) draw the hot exhaust air from the enclosure out through the original fan ports to the room's HVAC intake vents. Fig. 7 shows one projector in its modified enclosure with ducting and an in-line fan.
Table 3 shows a series of audio noise measurements with various equipment on or off and also comparing the noise from the original Barco projector enclosures to our redesigned enclosures. Our custom design reduced the projector noise by 13.3 dB, and we believe that we can reduce it even further by isolating the noise of the cooling fans.
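To put the 13.3 dB figure in more familiar terms, the following short sketch converts a level difference in decibels to the corresponding sound-pressure and acoustic-power ratios using the standard definitions. The function names are illustrative only.

```python
def db_to_pressure_ratio(db):
    """Sound-pressure (amplitude) ratio for a level difference in dB."""
    return 10 ** (db / 20)

def db_to_power_ratio(db):
    """Acoustic-power (intensity) ratio for a level difference in dB."""
    return 10 ** (db / 10)

reduction_db = 13.3  # reduction reported in the text
print(f"pressure ratio: {db_to_pressure_ratio(reduction_db):.2f}")
print(f"power ratio:    {db_to_power_ratio(reduction_db):.2f}")
```

A 13.3 dB reduction therefore corresponds to a factor of about 4.6 in sound pressure and about 21 in acoustic power.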

Audio
We have designed a series of loudspeaker layouts to support multiple sound spatialization techniques including Wavefield Synthesis (WFS), Ambisonics, Vector Base Amplitude Panning (VBAP) and Distance-Based Amplitude Panning (DBAP) [30,31].
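As an illustration of one of these techniques, the sketch below computes DBAP loudspeaker gains for a single source following the commonly published formulation: gains fall off with source-to-speaker distance and are normalized to constant total power. The rolloff and spatial-blur defaults are assumptions chosen for the example, per-speaker weights are omitted, and this is not the instrument's spatialization software.

```python
import math

def dbap_gains(source, speakers, rolloff_db=6.0, blur=0.1):
    """Return one gain per loudspeaker for a source position.

    source:     (x, y) or (x, y, z) source position.
    speakers:   list of loudspeaker positions with the same dimensionality.
    rolloff_db: level drop per doubling of distance (assumed default).
    blur:       spatial blur that keeps distances away from zero (assumed default).
    """
    a = rolloff_db / (20.0 * math.log10(2.0))      # distance exponent
    dists = [
        math.sqrt(sum((s - p) ** 2 for s, p in zip(source, spk)) + blur ** 2)
        for spk in speakers
    ]
    # Normalization constant so that the squared gains sum to one.
    k = 1.0 / math.sqrt(sum(1.0 / d ** (2.0 * a) for d in dists))
    return [k / d ** a for d in dists]

# Four loudspeakers on a unit ring, source close to the first one.
ring = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0), (0.0, -1.0)]
gains = dbap_gains((0.8, 0.0), ring)
print([round(g, 3) for g in gains])
```

Unlike VBAP, this formulation makes no assumption about the listener's position, which is one reason it can suit a space in which several people stand at different points.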
Currently we are using the third prototype audio system containing three rings of Meyer MM4XP loudspeakers (12 each in the top and bottom plus 30 in the middle for 54 total) plus one large Meyer X800 subwoofer, driven by five AudioFire 12 FireWire 400 audio interfaces from Echo Audio connected to a MacPro. Our fourth prototype will add almost 100 more MM4XP loudspeakers to the existing 3-ring design, planned at 100 speakers on the horizontal to support WFS plus 20 each in the top and bottom rings, and has been mapped out in CAD to help plan the installation.
To keep down the audio noise floor, the speakers' power supplies (Meyer MPS-488), along with the audio interfaces and the audio rendering computers, are located in an acoustically isolated equipment room on the ground floor of the facility, outside of the near-to-anechoic chamber. Since each loudspeaker carries an independent audio signal, one cable per loudspeaker comes up through the ceiling of this equipment room into a cable tray and then to the speaker's position outside the screen. We plan to eventually isolate all video and audio rendering computers in this machine room.
A sixth Echo AudioFire 12 interface attached to the production Lubuntu box allows audio rendering from the same single computer that can drive the four Barco projectors. These 12 audio output channels go to 12 of the 60 audio inputs on the five AudioFire 12 boxes connected to the MacPro. Having real-time audio along with a 10G Ethernet connection between these two machines supports several audio rendering architectures along a spectrum of distributed computing complexity: directly addressing 12 of the 54.1 speakers; a static 12:56 matrix upmix; taking the 12 output channels as inputs to network-controlled dynamic sound spatialization software [32] running on the MacPro; and encoding any number of dynamic sources to second-order Ambisonics on Lubuntu with a 54.1 decode on OSX.
We have designed our own custom speaker mounting hardware (shown in Fig. 11) according to our acoustic studies and spatial configuration discussed above. The mounting system is designed to prevent sympathetic vibrations so that there is no speaker buzz.

Ensemble-style interaction and the DeviceServer
We use the term "ensemble-style interaction" to describe our approach to multi-user interactivity, by analogy with a musical ensemble [33]. At one extreme, one user actively manipulates the environment via interactive controls while other users observe passively. We also support many other models in which multiple users adopt various roles and then perform associated tasks concurrently. One form consists of a team of researchers working together across the large visual display, each researcher performing a separate role such as navigation, querying data, or modifying simulation parameters. Another configuration gives each researcher an individual tablet display while immersed in the large display system. These tablets can both display a personalized view of specific parts of the information and also provide the ability to push a new view to the large display to be shared with other researchers. In order to simplify incorporating multiple heterogeneous interactive devices in VR applications, we developed a program named the DeviceServer to serve as a single networked hub for interactivity [34,35]. The DeviceServer removes the need for content application developers to worry about device drivers and provides a simple GUI enabling users to quickly configure mappings from interactive device controls to application functionalities according to their personal preferences. Multiple devices (e.g., for multiple users) can be freely mapped to the same application, e.g., each controlling different parameters, or with inputs combined so that multiple devices control overlapping sets of parameters. This scheme offloads signal processing of control data onto a separate computer from visual and audio renderers; all signal processing is performed via JIT-compiled Lua scripts that can easily be iterated without having to recompile applications. Interactive configurations can be saved and quickly recalled using Open Sound Control [23] messages.
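The mapping idea can be sketched in a few lines. The example below is a hypothetical illustration in Python, not the DeviceServer's actual interface (which, as noted above, uses a GUI, Lua scripts, and Open Sound Control); the class name `ControlMapper`, the device and control names, and the parameter addresses are all invented for the example. It shows several devices being routed to application parameters, each through its own small signal-processing function.

```python
class ControlMapper:
    """Hypothetical hub that routes device controls to application parameters."""

    def __init__(self):
        # (device, control) -> list of (parameter, transform) routes
        self._routes = {}

    def map(self, device, control, parameter, transform=lambda v: v):
        """Route one device control to one parameter, with optional processing."""
        self._routes.setdefault((device, control), []).append((parameter, transform))

    def handle(self, device, control, value):
        """Return the (parameter, processed value) messages an input produces."""
        routes = self._routes.get((device, control), [])
        return [(parameter, transform(value)) for parameter, transform in routes]

mapper = ControlMapper()
# One user navigates with a gamepad: stick range [-1, 1] scaled to degrees per second.
mapper.map("gamepad1", "left_stick_x", "/nav/turn", lambda v: v * 90.0)
# Two other users share control of the same simulation parameter from tablets.
mapper.map("tablet2", "slider1", "/sim/temperature", lambda v: 273.0 + v * 100.0)
mapper.map("tablet3", "slider1", "/sim/temperature", lambda v: 273.0 + v * 100.0)

print(mapper.handle("gamepad1", "left_stick_x", 0.5))
print(mapper.handle("tablet3", "slider1", 0.25))
```

Keeping the transforms outside the rendering application mirrors the design choice described above: mappings can be changed without recompiling the application that consumes the resulting messages.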

Tracking and other devices
There is a 14-camera tracking system [36] installed in the instrument, which can track both visible and infrared LEDs. Fig. 12 shows a researcher using LED gloves tracked by the system. Integrating the tracking system into the overall design required careful consideration. Cameras must be located behind the screen, so as not to block visual projection, but also must be positioned in a way that affords tracking multiple users on the bridge simultaneously. Custom mounts were designed for the cameras in order to hold their twin apertures directly in front of screen perforations, which were slightly widened to increase the cameras' field of view. These mounts attach to the screen via machine screws that insert directly into nearby screen perforations. Of the 14 cameras, 10 are currently mounted in a ring around the outside surface of the top of the sphere, with the remaining 4 mounted in the openings on either side of the bridge.
The emitters used with our active stereo projectors and glasses also use infrared light, and out of the box there is interference such that glasses in line of sight of IR tracking LEDs are not able to synchronize. Fortunately, there is enough separation between the wavelengths of the two sources of IR light that we were able to solve this problem with optical filters attached to the IR receivers of the shutter glasses. We tested two types of filters: Long Wavepass Filter (LPF) and Schott Color Glass Filter (CG). Although the long wavepass filter had the better bandpass range for our application, this type of filter is directional, correctly blocking interference from IR LEDs at certain head angles but not at others. In contrast, the performance of the color glass filter does not depend on direction, and these filters allowed perfect operation of the shutter glasses alongside the IR LEDs even though they pass the highest IR frequencies (containing about 25% of the energy from the emitters).
Other devices are being continuously integrated into the instrument in order to augment multi-user control of applications. Recently, an array of Microsoft Kinects was installed to scan users on the bridge and re-project them within the artificial ecosystem of the Time of Doubles artwork [37], as Fig. 13 shows.
In addition to providing interactive controls to multiple users, our current research also gives users individual viewports into data visualizations [38]. Using tablet devices, users can interactively explore detailed textual information that would otherwise disruptively occlude the shared large-screen view. Fig. 17 shows multiple researchers using tablets to explore a graph visualization of social network data. Each user has a tablet controlling a cursor on the large shared screen to select individual nodes in the graph. The textual information associated with selected graph nodes then appears on the tablet of the user performing the selection. When users find information that they think would be interesting to others, they can push the data to the shared screen for everyone to see.
Mobile devices interact with applications using the app Control [39], available for free from both the Apple App Store and the Android Market. Control is our open source application enabling users to define custom interfaces controlling virtual reality, art, and music software.

Projects tested displaying multi-user capabilities
We believe that use of the system through developing our research content is the most important driver of technology [40]. Over the past 5 years we have focused on projects crossing diverse content areas that facilitate the development of multimodality, multi-user interaction, and immersion. Of our many successful projects, here we will describe a small subset that focuses on multi-user group participation as described above.

AlloBrain
The AlloBrain research project (shown in Fig. 1 (left) and Fig. 14) gives roles to an ensemble of researchers for collaborative data exploration while immersed in the fMRI data both visually and sonically. One user navigates the shared viewpoint with a wireless device while other people use various devices to query the data.

TimeGiver
The TimeGiver project (Figs. 15 and 16) explores multi-user audience group participation in the desktop display mode. Audience members download a custom biometric app to their smart phones, made specifically for this interactive installation, that uses the phone's LED and camera to obtain a photoplethysmogram (PPG) that captures heart rate, blood flow, level of blood oxygenation, etc. The app can also interface with low-cost off-the-shelf electroencephalography (EEG) sensors to monitor brainwave activity. These time-varying physiological data dynamically determine the visual and the sonic output of the installation.

Graph browser
The GraphBrowser application (Fig. 17) enables multiple users to collaboratively explore annotated graphs such as social networks or paper coauthorship networks. The desktop display mode shows the full graph stereographically, while tablet devices held by each researcher display individualized additional textual information. There are two roles for researchers in this application: navigation and node querying. Navigation controls allow a navigator to rotate the graph, move the virtual camera, and manipulate global parameters of the visualization presented on the shared display. Concurrently, additional researchers can select nodes, query them for associated textual data, and view the query results on personal tablets. By displaying text on tablets we avoid occluding the shared display with text that is particular to individual researchers and also provide a better reading experience by enabling individuals to customize viewing distance and text size.
In order to foster collaboration, the shared display shows a visual browsing history of each user. Each researcher (actually each tablet device) has a unique associated color, used both for a selection cursor on the shared display (which the user moves via touch gestures on his or her tablet) and also to mark previously queried nodes. This strategy helps researchers to identify unexplored areas of the graph and also provides contextual awareness of the other users' activities. We also enable individuals to push data they deem of interest to collaborators from their individual tablet to the shared display for everyone to analyze simultaneously. Fig. 17 shows two researchers exploring social network data using tablets and the shared display.

Copper tungsten
Our series of Copper Tungsten visualizations employs both desktop and surround display modes to give our materials science collaborators different ways to view the same volumetric dataset. These scientists are familiar with volumetric visualizations that are 3D but not stereoscopic. The first visualization, "Slice Viewer" (Fig. 18), is inspired by tools commonly used to view MRI volumetric datasets; it uses the desktop display mode to show three interactively movable, orthogonal slices through the volume. The left half of the display shows the three slices in context in perspective, while the right half shows the same three slices in a flat (viewport-aligned) fashion so that detail will be most apparent.
The second (Fig. 19), also using the desktop display mode but with stereographics, is a volumetric rendering of the dataset that uses alpha-blending (translucency) to make the interior of the volume visible. Unfortunately, the size and amount of detail of this dataset make it impossible to apprehend the entire 3D volume visually; occlusion makes it difficult to see the inside structure of the volume.
The third visualization of this dataset uses the surround display mode in conjunction with raycasting rendering in a distance field, allowing the researchers to "go inside" the dataset rather than view it from a perspective looking in from outside.

Preliminary conclusions from the various projects
As we build out the system with a diverse set of content areas driving the design, we believe that there is a common set of benefits of our instrument. First and foremost, multi-user group interaction in an environment in which the users are unencumbered by technical devices seems to facilitate natural communication among groups of researchers. Not only does each user have his or her own sense of self while immersed in a dataset, but also each user has a sense of the other users' selves, which seems to facilitate communication within the group. With the instrument design mimicking real-world immersion, namely looking to the horizon, having no visual corner artifacts, full surround audio, and various forms of interaction including gestural control, we believe that a group of researchers can interact and can be immersed in a complex dataset much in the same way that they are immersed in the real world. Through these projects we have found that this instrument design facilitates immersion even in scenarios that are non-stereoscopic (for example when viewing panoramic photographs as shown in Fig. 2).

Conclusions and future work
Technology development has been intricately linked with system use throughout our ongoing research in this large-scale, full-surround, immersive, multimodal instrument. The plurality of programming environments supported by the desktop-like display mode facilitates easy access to the use of the instrument, while the in-house authoring software scales easily from single-screen to full-dome immersive display. A notable benefit of this approach has been the low barrier of entry for developing content. We continue to build the in-house infrastructure as an active research area.
A vital component of future work is the evaluation of the effectiveness of the instrument across heterogeneous content areas using immersion, multi-user interaction and multimodality. As we scale up the instrument, another important research area will include a better authoring environment for surround mode. We have an effective way of bringing in legacy content and we now focus on full-surround, omnistereo, and real-time physically based rendering.
We are currently prototyping how multi-user, real-time metaprogramming can be applied in our intensely demanding multimedia environment. Our goal is that multiple researchers (artists, scientists, technologists) can write and rewrite applications as they are immersed within them without pausing to recompile and reload the software [41], simply by opening a local network address in a laptop or mobile device browser to view code editors and graphical interfaces. Changes from multiple users are merged and resolved through a local Git repository, and notifications are broadcast to all machines of the rendering cluster, with live C/C++ code changes recompiled on the fly.
As we continue to build the instrument through content research, we will scale to many different platforms and devices from large immersive full-dome display to mobile platform devices, specifically focusing on 3D and immersion. The different scaled platforms will be connected together through our software infrastructure to make a multi-dimensional interconnected system from large full-dome instruments to small mobile devices that will be utilized as windows within windows for multiple resolutions of scale. We imagine an interrelated network where live-coding will facilitate communities of digital interactive research across many different application areas.

Fig. 2. Fisheye photograph of a group of researchers immersed in full surround non-stereoscopic data.

Fig. 4. Wide-angle photograph of the Time of Doubles project from the bridge of the instrument.

Fig. 5. CAD model with virtual translucent view from just outside the instrument, showing locations of 12 of the 26 projectors and most of the 55 loudspeakers.

Fig. 6. Another CAD model with virtual translucent view from outside the instrument, showing locations of 24 of the 26 projectors.

Fig. 8. Fisheye photograph from the bridge showing most of the 26 overlapping projection areas and calibration cameras mounted to the bridge railing.

Fig. 9. Map of pixel density (pixels per steradian) as seen from standing height at the center of the bridge. X is the longitude and Y is the latitude; Y's nonlinear spacing is because this is an equal-area projection, in other words each unit of area of the image represents the same solid angle on the screen. Each pixel's contribution is weighted by its alpha (blending) value (0-1) to discount the extra pixel density that occurs in projector overlap regions.

Table 3
Audio noise measurements (dB SPL, A-weighted, from center of bridge) as more equipment is turned on. Below the line are older measurements taken with original unmodified projector enclosures.

Condition                                                                   dB
All equipment turned off                                                    28.6
Panasonic fans on                                                           33.2
Fans and Barco projectors on                                                40.9
Entire current system on                                                    43.2
------------------------------------------------------------------------------
Everything off except original fans in factory projector enclosures         49.0
Barcos on inside factory enclosures                                         56.5

Fig. 11. Meyer MM4XP loudspeaker on custom mount. Left side of image shows sound absorption materials and right side shows the back of the projection screen.

Fig. 12. Researcher using the tracked gloves to explore fMRI brain data.

Fig. 13. Two visitors feeding and being consumed by artificial life organisms in the Time of Doubles artwork (2012). Visitors' occupation of physical space is detected by an array of Kinect depth cameras, and re-projected into the virtual world as regions of nutritive particle emanation, while physical movements cause turbulence within the fluid simulation.

Fig. 14. Multiple users with wireless devices and gestural control mining fMRI data.

Fig. 15. The TimeGiver project maps audience participants' EEG and PPG temporal patterns to create an immersive audiovisual installation.

Fig. 16. Close-up of two participants in the TimeGiver project using their smart phones to monitor blood pulse via PPG; the person on the right is also wearing a head-mounted EEG device.

Fig. 17. Tablets providing personal views and search and annotation tools in GraphBrowser, a project for collaborative graph exploration. Left: photo of two users interacting with the system. Center: graph as it appears on the shared display, with three color-coded cursors and already-visited nodes highlighted. Right: textual data and a graphical representation of already-visited nodes, as would appear on a tablet. (For interpretation of the references to color in this figure caption, the reader is referred to the web version of this paper.)

Table 1
Pulse generator settings for our custom synchronization system.

Table 2
Pixel area statistics per projector. Left column is projector number (same numbering scheme as Fig. 8); numbers 9-12 are Barcos. Units for other columns are square centimeters.