How does it work?
Digital information in the form of bits is modulated onto optical signals at the transmitter, typically using ON-OFF Keying or Pulse Width Modulation, and the signals are then processed over a temporal sequence of image frames by a camera receiver. The paradigm shift in this design is that image analysis is used to aid the demodulation of bits, in contrast to the direct processing of electrical signals in traditional communication system design.
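As a rough illustration of demodulation by image analysis, the sketch below recovers ON-OFF Keyed bits from a sequence of frames by thresholding the mean brightness of the pixel region that images the LED. It assumes one bit per frame, perfect frame synchronization, and a known LED region; the function names, frame sizes, and intensity levels are hypothetical and not taken from the Visual MIMO implementation.

```python
def demodulate_ook(frames, region, threshold=None):
    """Recover one bit per frame from the mean brightness of a pixel region.

    frames: list of 2-D lists of grayscale intensities, one per camera frame.
    region: (row_start, row_end, col_start, col_end), end-exclusive; the
            (assumed known) image region occupied by the LED.
    """
    r0, r1, c0, c1 = region
    means = []
    for frame in frames:
        pixels = [frame[r][c] for r in range(r0, r1) for c in range(c0, c1)]
        means.append(sum(pixels) / len(pixels))
    if threshold is None:
        # Midpoint between the darkest and brightest observation.
        threshold = (min(means) + max(means)) / 2
    return [1 if m > threshold else 0 for m in means]


def make_frame(led_on, size=4, background=20, led_level=200):
    """Build a synthetic frame with a 2 x 2 LED image in the centre."""
    frame = [[background] * size for _ in range(size)]
    if led_on:
        for r in range(1, 3):
            for c in range(1, 3):
                frame[r][c] = led_level
    return frame


sent_bits = [1, 0, 1, 1, 0]
frames = [make_frame(bit) for bit in sent_bits]
received_bits = demodulate_ook(frames, region=(1, 3, 1, 3))
print(received_bits)
```

In a real scene the LED region is not known in advance and moves between frames, which is why the receiver needs recognition and tracking before a threshold like this can be applied.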
Challenges in the Optical Spectrum
Optical wireless requires very narrow beams to achieve longer ranges because the signal-to-noise ratio is limited by several factors. First, transmission power levels are lower than in the RF spectrum because of the output power limitations of LED technology and eye safety restrictions for laser transmitters. Second, the optical spectrum is characterized by high background noise, typically from sunlight in the infrared and visible light wavelengths and from other IR heat sources in the vicinity. Optical wireless communication with narrow beams has hitherto been impractical in most mobile settings, because both the sender and the receiver need to operate with very narrow beams and angles of view, respectively, to achieve transmission ranges greater than a few tens of meters.
How does the MIMO analogy help?
A camera sensor is essentially an array of photodiodes (the pixels), and the camera lens provides a different narrow field of view for each photodiode. This creates a large number of highly directional receive elements, which reduces interference and noise and thereby allows large ranges, while still maintaining the wide field of view necessary for tolerating mobility.
Visual MIMO Communication Model
A visual MIMO system with K transmit elements in the light emitting array (LEA) can be modeled as

$$Y = \sum_{k=1}^{K} H_k x_k + N$$

where $Y$ is the image current matrix, with each element representing the received current in one pixel; $x_k$ is the transmitted optical power from the kth element of the LEA; $H_k$ is the channel matrix of the kth transmit element, with its elements representing the channel between the kth transmit element and each pixel; and $N$ is the noise matrix (pdf).
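To make the model concrete, the following sketch computes the image current matrix as the sum over transmit elements of each channel matrix scaled by the transmitted power, plus noise. The channel gains, the 2 x 3 pixel array, and the choice of Gaussian noise are illustrative assumptions; the source only states that N is a noise matrix.

```python
import random


def received_image(H, x, noise_std=0.0, rng=None):
    """Compute Y = sum_k H[k] * x[k] + N for a rows x cols pixel array.

    H: list of K channel matrices (each rows x cols), one per transmit element.
    x: list of K transmitted optical powers.
    noise_std: standard deviation of the (assumed Gaussian) per-pixel noise.
    """
    rng = rng or random.Random(0)
    rows, cols = len(H[0]), len(H[0][0])
    Y = [[0.0] * cols for _ in range(rows)]
    for Hk, xk in zip(H, x):
        for r in range(rows):
            for c in range(cols):
                Y[r][c] += Hk[r][c] * xk
    for r in range(rows):
        for c in range(cols):
            Y[r][c] += rng.gauss(0.0, noise_std)
    return Y


# Two transmit elements (K = 2) imaged onto a 2 x 3 pixel array.
# Each channel matrix is concentrated on a few pixels (hypothetical gains).
H = [
    [[0.8, 0.1, 0.0],
     [0.1, 0.0, 0.0]],
    [[0.0, 0.1, 0.8],
     [0.0, 0.0, 0.1]],
]
x = [1.0, 0.0]  # element 0 on, element 1 off
Y = received_image(H, x)  # noise-free by default
print(Y)
```

Because each channel matrix is nonzero only on the pixels that image its transmit element, the contributions of different elements stay largely separable in Y, which is what the receiver exploits.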
Camera vs. Photodetector Receiver
Cameras allow the receiver to selectively combine the pixel elements that receive a strong signal from the light emitting elements, a degree of flexibility that cannot be attained with a single photodetector receiver. Even in the absence of scene noise, we have been able to show analytically (pdf) that a camera receiver has a gain in its signal-to-noise ratio (SNR), and hence in its data rate, over a single photodiode receiver at short distances (see plot to the left). The tradeoffs in the visual MIMO system, however, are a limited receiver sampling frequency (e.g., hundreds to a thousand frames per second for lower-end cameras and a million frames per second for high-end models) and, as in all optical wireless communications, strong line-of-sight (LOS) requirements. By transmitting on multiple elements of the LEA, camera-based communication can take advantage of data rate gains from multiplexing and/or diversity (see plot to the right), using the multiple-input multiple-output (MIMO) setting inherent in the system.
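A toy calculation shows why selective combining can help. The sketch assumes every pixel carries independent noise of equal variance and that the selected pixels are simply summed; this is a simplified model for illustration, not the analysis in the linked paper, and the signal values are made up.

```python
def combined_snr(signals, noise_var, selected):
    """SNR after summing the selected pixels.

    Assumes independent, equal-variance noise per pixel, so the noise
    variance of the sum grows linearly with the number of pixels summed.
    """
    selected = list(selected)
    total_signal = sum(signals[i] for i in selected)
    total_noise_var = noise_var * len(selected)
    return total_signal ** 2 / total_noise_var


# Four pixels; only the first two receive light from the transmitter.
signals = [4.0, 4.0, 0.0, 0.0]
noise_var = 1.0

# Summing every pixel also collects noise from the two dark pixels.
all_pixels = combined_snr(signals, noise_var, range(len(signals)))

# Combining only the pixels with a strong signal avoids that noise.
strong = [i for i, s in enumerate(signals) if s > 1.0]
strong_only = combined_snr(signals, noise_var, strong)

print(all_pixels, strong_only)
```

Under these assumptions, dropping the two dark pixels halves the collected noise variance without losing any signal, so the SNR doubles.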
Real World Challenges
To realize the potential data rate gains, the visual MIMO system needs to identify which set of photodiodes (pixels) receives the signal or, equivalently, which region of the image contains the LED transmitters. The unique feature of the Visual MIMO approach is that receiver processing is based on image analysis. Processing a real-world scene, however, poses several challenges: (1) camera motion, (2) illumination variation (the appearance of the LEA transmitter changes as the illumination in the scene varies), and (3) background distractors (such as the many other objects in the scene). Using computer vision (CV) algorithms, we can locate the LED transmitter in the presence of background distractors through (a) recognition and (b) tracking.
Visual Channel Distortions and the Rate Adaptation Problem
Distortions in real-world scenes are typically observed as distortions in the size and shape of the transmitter image, partial visibility of the image, and even interference between the images of two different transmitters due to image blurring. This suggests that the throughput of visual MIMO links can be significantly improved through rate adaptation techniques, which adapt the transmission data rate to the receiver perspective. The rate adaptation problem is then to choose transmission modes that exploit the available parallel channels while keeping the error rate low. A transmission mode is a certain assignment of multiplexing and diversity functions to the set of light emitting elements (an example is shown below). Our Visual MIMO Rate Adaptation (VMRA) algorithms, Probe VMRA and Index VMRA, use special probes and a block-CRC indexing scheme, respectively, to detect occlusion more efficiently than an exhaustive search for occluded light emitters in each image (pdf).
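One simple way to think about a transmission mode is as a partition of the LEDs into groups: LEDs in the same group repeat one bit (diversity), while different groups carry different bits (multiplexing). The sketch below encodes and decodes under that assumption. The group layout, the use of None for an occluded LED, and the majority vote are hypothetical choices for illustration and are not the VMRA algorithms themselves.

```python
def encode(bits, mode):
    """Map one bit per group onto per-LED on/off states."""
    if len(bits) != len(mode):
        raise ValueError("need exactly one bit per LED group")
    num_leds = sum(len(group) for group in mode)
    states = [0] * num_leds
    for bit, group in zip(bits, mode):
        for led in group:
            states[led] = bit
    return states


def decode(observed, mode):
    """Recover one bit per group by majority vote over its visible LEDs.

    observed[i] is 0, 1, or None when LED i is occluded. A group with no
    visible LED yields None (the bit is lost).
    """
    bits = []
    for group in mode:
        visible = [observed[led] for led in group if observed[led] is not None]
        if not visible:
            bits.append(None)
        else:
            bits.append(1 if 2 * sum(visible) >= len(visible) else 0)
    return bits


# Three modes for a 4-LED array (LED indices 0..3).
MULTIPLEXING = [[0], [1], [2], [3]]  # 4 bits per frame, no redundancy
DIVERSITY = [[0, 1, 2, 3]]           # 1 bit per frame, maximum redundancy
MIXED = [[0, 1], [2, 3]]             # 2 bits per frame, 2 LEDs per bit

# Occluding LED 0 loses a bit under full multiplexing ...
seen = encode([1, 0, 1, 1], MULTIPLEXING)
seen[0] = None
lost = decode(seen, MULTIPLEXING)

# ... but the mixed mode still recovers both of its bits.
seen = encode([1, 0], MIXED)
seen[0] = None
recovered = decode(seen, MIXED)

print(lost, recovered)
```

The example shows the tradeoff that rate adaptation has to manage: modes with more groups carry more bits per frame but tolerate less occlusion or blending.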
Envisioning a Visual MIMO Network
A Visual MIMO network is typically characterized by highly directional transmitters and receivers, strict line-of-sight requirements, and a perspective-dependent multiplexing gain. It therefore opens up a wide spectrum of challenging research opportunities, even at the link and network layers.
Perspective Dependent Multiplexing Gain
The receiver can distinguish all LEDs when it has a full frontal view of the transmitter array at close distance, in which case bits can be multiplexed over all light emitting elements. At a large distance or from an angled view, the LEDs blend together in the image, and the diversity mode remains an option.
Use of Geometric Information
At the physical layer, for example, the transmitter could provide the receiver with information about the transmitter LED array geometry (i.e., an LED template) to assist the receiver in recognition, tracking, and demodulation. Geometry is also useful at the network layer because, unlike for RF wireless channels, link bitrates are quite predictable given the network geometry (distance and orientation between nodes).
Visual Ranging and Network Localization
Given a known LED template, distance and angle information can be generated through camera pose estimation, an image analysis technique. It is worth studying whether accuracy can be improved through particular signaling techniques or additional information from the transmitter, particularly under partial occlusion or FOV clipping of the LED array.
Visual Multi-path Transmissions
Consider a scenario with three nodes: a source, a destination, and one potential relay, which is positioned between the two other nodes but closer to the source than to the destination (without obstructing the line of sight between source and destination). Because shorter distances allow higher multiplexing gain, it is likely that the link capacity between source and relay is greater than the others, Csr > Crd > Csd. Thus, the multi-hop path sr-rd has higher capacity than the direct link sd, but the highest throughput can be achieved through simultaneous transmission through the relay and on the direct link.
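The relay argument can be checked with a small calculation. The sketch assumes an idealized setting in which the two-hop path is limited by its slower hop and the relay path and direct link do not interfere, so their capacities add; the capacity values are hypothetical and chosen only to satisfy Csr > Crd > Csd.

```python
def path_capacities(c_sr, c_rd, c_sd):
    """Compare direct, relayed, and simultaneous transmission.

    Idealized assumptions: the relay path is limited by its slower hop,
    and the relay path and the direct link can be used at the same time
    without interfering with each other.
    """
    relay_path = min(c_sr, c_rd)
    return {
        "direct": c_sd,
        "relay": relay_path,
        "both": relay_path + c_sd,
    }


# Hypothetical link capacities (arbitrary units) with Csr > Crd > Csd.
capacities = path_capacities(c_sr=30, c_rd=20, c_sd=10)
best = max(capacities, key=capacities.get)
print(capacities, best)
```

With these numbers the relay path carries 20 units against 10 on the direct link, and using both together gives 30, matching the ordering described above.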