An Analysis on the Impact of Video Traffic over Wireless Visual Sensor Network under Mannasim Sensor Network Architecture

Abstract - The monitoring capability of a Wireless Sensor Network (WSN) is an important feature that makes its applications multidisciplinary. A common WSN handles only simple scalar data, and is therefore insufficient for complex applications such as object detection, surveillance, image recognition, localization, and tracking. The functionality of a WSN can therefore be improved by adding vision capabilities to the normal sensing infrastructure. However, the hardware and software specifications of a typical WSN node are designed only for handling simple tasks. The main objective of this work is to extend the WSN to a Visual Sensor Network (VSN) by adding visual sensor nodes to the traditional WSN infrastructure, and to study the impact of video traffic on the overall performance with respect to different metrics. For this purpose, we simulated a sensor network of Mica2 Motes in which nodes with added visual sensors act as cluster heads as well as routers. The data generated by the Mica2 Motes and by the visual sensor nodes are collected by separate sink nodes. The performance of the network is studied for different numbers of simultaneous video connections. The impact of video traffic on the scalar data flow, as well as on the overall performance, is analyzed with suitable metrics.

Location and Orientation Information - Most image processing algorithms need information about the position of the camera nodes and the orientation of the cameras.
Time Synchronization - A captured image may be meaningless if it is not accompanied by the time at which it was captured. Time synchronization is therefore essential in VSNs.
Data Storage - The cameras create large amounts of data over time, which need to be stored for later analysis.
Camera Interplay - Precise information regarding the position and orientation of each camera is vital for many algorithms used in VSNs. This information is obtained during the calibration phase by processing a set of special markers. The cameras' settings can be adjusted at a processing center, which gathers the special markers from each camera.

C. Performance of VSNs
The performance of a VSN depends on many factors, which have already been addressed by other researchers [6], [9]-[14]:

II. SIMULATOR SELECTION
There is a variety of simulators available for simulating wireless networks, but almost none of the existing network simulators fully supports WSN simulation. Moreover, to the best of the authors' knowledge, there is no commercial or freely available simulator that simulates all the features of VSNs. Most recent simulation-based studies of WSNs and VSNs use NS-2 or OMNeT++, so in this section we present and compare these two simulators.

A. Mannasim - Network Simulator (NS-2) [Simulator I]
NS-2 is one of the best-known network simulators and is widely used in research. It covers almost all of the IP protocols, but does not support all of the features needed to simulate a WSN [15], [16]. Specifically, the WSN characteristics that can be simulated by NS-2 are the following [15]-[17]:

Another well-known simulator is OMNeT++ [15]-[17], [19], which simulates wired and wireless networks. Its components are written in C++, while models are described in a high-level language called NED (NEtwork Description language). OMNeT++ has an extension for WSN simulation called Castalia [17], [19]-[22], which also provides a component for studying intrusion detection.

We decided to use the Mannasim extension of NS-2 for this simulation study because of its popularity and the support provided by its user community. Moreover, we focus only on the transport and network layer issues of the VSN design, for which NS-2 is well suited. In addition, the video traffic generator used for the VSN simulation was originally written for NS-2, so it is easily portable to Mannasim wireless sensor network simulations.

III. MODELING THE VIDEO TRAFFIC AND VIDEO SENSOR NETWORK

A. The Simulated VSN Application Scenario
The simulated network can be regarded as a remote monitoring scenario, such as a forest animal monitoring sensor network, a forest fire monitoring system, or an intrusion detection network used by the military. Irrespective of the application, the main functionality of the network is more or less the same, so here we explain its generic aspects. In the WSN/VSN scenario, there are thirty-six normal sensor nodes (with capabilities equal to those of a Mica2 mote), which generate temperature data and periodically forward it to nine Cluster Heads (CH). The CHs are distributed over the area and also act as Camera Nodes (CN). These nodes require higher capability than Mica2 Motes, so we set their parameters to behave as normal MANET nodes. The CH nodes aggregate the simple scalar data transmitted periodically by the Mica2 Motes and forward it to the data sink node through the two access point nodes. The CNs start sending five seconds of video at random times to simulate an intrusion detection event, and forward it to the requesting sink nodes through other CH/video sensor nodes. The video data from any CH/CN can be made available to any of the five video sink (VS) nodes. In this scenario, we increase the number of simultaneous video connections between the visual sensor nodes and the video sinks from 1 to 5 and study how the normal sensor data is affected by the video traffic.
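The scenario above can be summarized as a small configuration sketch. This is only an illustration of the node counts and of the experiment sweep, not the Mannasim/NS-2 script used in the study. The class and function names are hypothetical, the single data sink is our reading of "the data sink node", and the UDP/TCP transports and the three repetitions are taken from the results section.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Scenario:
    """Node counts of the simulated WSN/VSN (hypothetical container)."""
    mica2_motes: int = 36        # scalar temperature sources
    cluster_heads: int = 9       # each one also acts as a camera node (CH/CN)
    video_sinks: int = 5         # VS nodes
    data_sinks: int = 1          # assumption: one data sink node
    access_points: int = 2       # relay aggregated scalar data to the data sink
    video_burst_s: float = 5.0   # length of each simulated intrusion video


def experiment_runs(transports=("UDP", "TCP"), max_connections=5, repeats=3):
    """Enumerate every (transport, simultaneous connections, run) combination."""
    return [
        {"transport": t, "video_connections": c, "run": r}
        for t, c, r in product(
            transports, range(1, max_connections + 1), range(1, repeats + 1)
        )
    ]
```

Under these assumptions the sweep consists of 2 transports x 5 connection counts x 3 runs, and each reported value would be the average over the three runs of one combination.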

B. The Simulation of Video Traffic
Based on the traffic model proposed in [3], the packet generation sequence is controlled as shown in the following diagram. The computational steps are performed as follows:
1. Transmit a video packet when the video traffic starts.
2. Compute the transmission time of the next video packet (Next Video Packet Time): $\mathrm{NVPT} = \mathrm{lognormal}(x_1, y_1)$.
3. Compute the time $t_i$ until the next small (control) packet is transmitted: $t_i = \mathrm{lognormal}(x_2, y_2)$.
4. Different parameters are used to calculate the inter-video packet time (variable NVPT) and the inter-control packet time (array $t_i$), in order to take the different packet sizes into consideration. The values $t_1$ to $t_n$ are summed in the variable SumNVPT ($\mathrm{SumNVPT} = t_1 + t_2 + t_3 + \dots + t_n$).
5. While the value of SumNVPT is less than NVPT, $t_i$ is used as the inter-packet time for the transmission of small packets. Otherwise, the inter-packet time is $\mathrm{NVPT} - (\mathrm{SumNVPT} - t_i)$.
The size of control packets, SC, is assumed to be 130 bytes and the size of video packets, SV, is 8015 bytes. In this scenario, the log-normal distribution with parameters $x_1 = 1.5514$ and $y_1 = -3.7143$ is used to simulate the inter-video packet time for big packets, while the log-normal distribution with parameters $x_2 = 2.5647$ and $y_2 = -9.1022$ is used to simulate the inter-packet time for small packets. Comparing the traffic patterns of the original and the simulated video traffic in the work of Peng Gao et al. [28], it is clear that the simulated traffic closely resembles the original traffic pattern. In that work [28], the traffic generator was tested on a high-bandwidth wired network simulation model. In this work, the video traffic generator is tested on a wireless video sensor network that includes some lower-capability wireless nodes.
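The five steps above can be sketched in a few lines of Python. This is a minimal illustration, not the NS-2 traffic generator used in the study. Two assumptions are ours: since a standard deviation cannot be negative, each y value is taken as the mean of the underlying normal distribution and each x value as its standard deviation, and the sampled intervals are taken to be in seconds. The function name `generate_schedule` is hypothetical.

```python
import random

SV_BYTES = 8015              # size of a video (big) packet
SC_BYTES = 130               # size of a control (small) packet
X1, Y1 = 1.5514, -3.7143     # inter-video-packet time parameters
X2, Y2 = 2.5647, -9.1022     # inter-control-packet time parameters


def generate_schedule(duration_s, rng):
    """Return a time-ordered list of (send_time_s, size_bytes) tuples."""
    events = []
    now = 0.0
    while now < duration_s:
        events.append((now, SV_BYTES))          # step 1: send a video packet
        # Assumption: y is the log-mean (mu) and x the log-std-dev (sigma).
        nvpt = rng.lognormvariate(Y1, X1)       # step 2: next video packet time
        sum_nvpt = 0.0
        while True:
            t_i = rng.lognormvariate(Y2, X2)    # step 3: next small-packet gap
            sum_nvpt += t_i                     # step 4: accumulate SumNVPT
            if sum_nvpt >= nvpt:
                # Step 5, second case: the remaining gap up to the next video
                # packet is NVPT - (SumNVPT - t_i); no small packet is sent.
                break
            send_time = now + sum_nvpt          # step 5, first case
            if send_time < duration_s:
                events.append((send_time, SC_BYTES))
        now += nvpt                             # next video packet goes out here
    return events
```

For example, `generate_schedule(5.0, random.Random(1))` produces one five-second burst of the kind each camera node sends in the scenario. Passing a seeded `random.Random` instance keeps runs reproducible.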

C. Data Generation at the Mica2 Mote [16] Sensor Nodes
Mannasim [18], [19] has a simulation mechanism for simple scalar data. The following table (Table 2) briefly describes the parameters used by Mannasim to simulate the Mica2 Motes. The Mannasim framework provides temperature and carbon monoxide data generators. In this simulation, the image processing and storage aspects of the visual sensor nodes were not taken into account, since we focus only on the transport layer and network layer issues of the VSN design.

B. Metrics Used for Evaluation
This work studies the impact of increasing the number of simultaneous video connections on the normal sensor data flow as well as on the video traffic flow to the video sink nodes. Therefore, the performance and overhead of both the Mica2 Motes and the visual sensor nodes are measured. The event trace file of a Mannasim simulation is entirely different from a normal NS-2 event trace file. This is because, under Mannasim, not all the data packets generated at the Mica2 Motes are actually sent to the data sink node. Instead, the cluster head nodes aggregate the data and send it to the data sink in new packets. As a result, the sequence numbers of the packets generated at the Mica2 Motes are entirely different from those of the packets arriving at the data sink. The metrics applied in the impact analysis are summarized in the following:

1) Data Packets Received
The number of data packets received at the sink node is considered an important metric for studying the impact of video traffic on normal sensor data flows.

2) Normalized Routing Load
Normalized routing load is the ratio of the number of routing packets generated to the number of data packets successfully delivered. The normalized routing load at the Mica2 Motes is measured as the ratio of the number of routing packets generated to the number of data packets successfully delivered at the data sink. The normalized routing load at the visual sensors is measured as the ratio of the number of routing packets generated to the number of video packets successfully delivered at the video sink.
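As a minimal sketch of this ratio, assuming the packet counts have already been extracted from the trace file (the function name and the sample counts are hypothetical and are not results from this study):

```python
def normalized_routing_load(routing_packets, delivered_packets):
    """Routing packets generated per data/video packet delivered at the sink."""
    if delivered_packets <= 0:
        # Nothing was delivered, so the load per delivered packet is unbounded.
        return float("inf")
    return routing_packets / delivered_packets
```

For example, 300 routing packets and 100 delivered data packets would give a normalized routing load of 3.0. The function is applied once with the Mica2 counts and the data sink, and once with the visual sensor counts and the video sink.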

3) Routing Message Overhead
In this work, we measure overhead in terms of the number of generated and forwarded routing messages. We measure it separately at the Mica2 Motes and at the VS-Nodes.

4) Dropped Packets at Mica2 Motes
The number of dropped packets in a WSN/VSN scenario is an important metric for evaluating network performance. In a wireless ad hoc network scenario, packets can be dropped for several reasons. We count the packets dropped at the Mica2 Motes and at the VS-Nodes in order to understand the dropping behavior at two different levels.

5) Throughput of Scalar Data Flow
Normalized throughput is the ratio of the number of data packets successfully delivered to the duration of the traffic. It is generally measured in kbps or Mbps. We measure the throughput of the video flow as the ratio of the number of video packets successfully delivered to the duration of the overall video traffic. We measure the throughput of the data flow at the Mica2 Motes as the ratio of the number of scalar data packets successfully delivered to the duration of the overall data traffic.
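Because the definition counts packets while the unit is kbps, a conversion through the packet size is needed. The sketch below assumes a fixed packet size per flow and 1 kbps = 1000 bit/s. The function name and the sample values are hypothetical.

```python
def throughput_kbps(delivered_packets, packet_size_bytes, duration_s):
    """Delivered payload per unit time, in kilobits per second."""
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    delivered_bits = delivered_packets * packet_size_bytes * 8
    return delivered_bits / duration_s / 1000.0
```

For instance, 100 video packets of 8015 bytes delivered over 5 seconds correspond to 1282.4 kbps.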

6) MAC Load
The MAC load is the ratio of the number of packets sent at the MAC layer to the number of data packets successfully delivered.
The MAC load at the Mica2 Motes is measured as the ratio of the packets sent at the MAC layer of the Mica2 Motes to the data packets successfully received at the data sink.
The MAC load at the VS-Nodes is measured as the ratio of the packets sent at the MAC layer of the visual sensor nodes to the video packets successfully received at the video sink.

7) Average Consumed Energy
The average consumed energy is the average energy spent by all the nodes in the network, measured in Joules. In this analysis, we calculate the average consumed energy separately for the Mica2 Motes and for the VS-Nodes, in order to understand the energy consumption behavior of the two types of nodes.

8) Packet Delivery Fraction (PDF)
The PDF is the ratio of the number of data packets successfully received at the sink to the number of data packets generated at the source. The PDF of the scalar data flow is not considered, because the packets generated by the Mica2 Motes do not themselves reach the data sink (the cluster heads aggregate them into new packets). So, only the PDF of the video flows is considered. It is measured as the ratio of the number of video packets successfully received at the video sink to the number of video packets generated at the visual sensor nodes.

9) Average End to End Delay (EED) of Video Packets (ms)
To measure the end-to-end delay, we need the time at which a packet is generated and the time at which the same packet arrives at the sink. The sequence number is used to identify each packet during this calculation. Since the sequence numbers of the packets sent by the Mica2 Motes and of the packets received at the data sink are different in a Mannasim simulation, we did not compute the EED of the normal sensor data flows (Mannasim displays the EED of each data packet directly on the console during the simulation). So, only the end-to-end delay of the video traffic is measured herein. The EED of a video packet is the difference between the time at which the packet is generated at the source and the time at which the same packet arrives at the video sink.
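The sequence-number matching described above can be sketched as follows. The sketch assumes that the send and receive events of the video flow have already been parsed from the trace into two dictionaries keyed by sequence number, with timestamps in seconds. The function name and the input layout are hypothetical, and the same matching also yields the packet delivery fraction of the video flow.

```python
def video_flow_stats(sent, received):
    """Return (pdf, average EED in ms) for one video flow.

    sent and received map a packet sequence number to its timestamp in
    seconds at the source and at the video sink, respectively.
    """
    if not sent:
        raise ValueError("no generated packets")
    # Only packets seen at both ends contribute a delay sample.
    delays = [received[seq] - sent[seq] for seq in received if seq in sent]
    pdf = len(delays) / len(sent)
    avg_eed_ms = 1000.0 * sum(delays) / len(delays) if delays else None
    return pdf, avg_eed_ms
```

With `sent = {1: 0.0, 2: 0.5, 3: 1.0, 4: 1.5}` and `received = {1: 0.25, 3: 1.5}`, two of the four packets arrive, so the PDF is 0.5 and the average EED is 375 ms. This matching is not applicable to the scalar data flow, because the cluster heads renumber the aggregated packets.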

C. The Results
The following tables show the measurements made on the video flows and the normal data flows of the Visual Sensor Network. The values shown in the tables are the average of three runs of the simulation. The first table shows the results measured for the flows between the Mica2 Motes and the data sink node. The second table shows the results measured for the video flows between the visual sensor nodes and the video sink nodes.

The following two line graphs (Fig. 2a, 2b) compare the dropped packets with respect to the increasing number of simultaneous video connections.

The following two line graphs (Figs. 3a, 3b) show the overhead with respect to the increasing number of simultaneous video connections. Here we measure overhead in terms of the total number of routing messages generated and forwarded by the Mica2 Motes and the video sensor nodes. As shown in the line graphs, the overhead at the Mica2 Motes (denoted as WSN) is strongly affected by the increase in the number of simultaneous video connections when TCP is used for the video flows.

The following two line graphs (Fig. 4a, 4b) show the energy consumption, measured in Joules, with respect to the increasing number of simultaneous video connections. As shown in the line graphs, the energy consumption of both the Mica2 Motes and the visual sensor nodes increases with the number of simultaneous video connections. Since the visual sensor nodes process much more data, they consume much more energy than the Mica2 Motes. However, using TCP for the video traffic reduces the energy consumption considerably.

The following two line graphs (Fig. 5a, 5b) show the MAC load with respect to the increasing number of simultaneous video connections. The MAC load increases with the number of simultaneous video connections.

The following two line graphs (Fig. 6a, 6b) show the routing load with respect to the increasing number of simultaneous video connections. The routing load increases with the number of simultaneous video connections. The routing load at the visual sensor nodes was lowest when TCP was used for video transport, whereas the routing load at the Mica2 Motes was high when TCP was used for video transport.

The following graph (Fig. 7) shows the data packets received at the sink under different video loads on the network. The increase in the number of simultaneous video connections clearly reduces the number of scalar data packets, sent from the Mica2 Motes, that reach the data sink. Using TCP for video improves the performance slightly in some cases. In general, the end-to-end delay of data packets in a typical sensor network is high because of the delays involved in data dissemination and aggregation.

The following graph (Fig. 8) shows the average end-to-end delay of video packets under different video loads on the network. With UDP transport for video, the increase in the number of simultaneous video connections considerably increases the end-to-end delay. With TCP transport for video, the average end-to-end delay was much better and remained at almost the same level for different numbers of simultaneous video connections. The high end-to-end delay is a big obstacle to using this network for real-time monitoring applications.

The following graph (Fig. 9) shows the average packet delivery fraction of video packets under different video loads on the network. The increase in the number of simultaneous video connections rapidly decreases the PDF. With TCP for video transport, the PDF was comparatively good. The low PDF makes it impossible to use this network for sending high-quality graphics with conventional encoding schemes. To overcome this problem, video coding schemes should be designed to withstand high packet loss.

The following two graphs (Fig. 10a, 10b) show the throughput of the video traffic and of the normal sensor data flow under different video loads on the network. As shown in the first graph (Fig. 10a), the throughput increased with the number of connections. With TCP for video transport, however, the throughput increases for the data flow and decreases for the video flow.

E. Problems Identified and Suggestions for Improvement
In this VSN, the normal Mica2 Motes and the visual sensor nodes use the same communication channel, since each visual sensor node also acts as the cluster head that aggregates the scalar sensor data sent by the Mica2 Motes. Because the same channel is used for transmitting video packets from the visual sensor nodes, the video traffic interferes with the communication of the Mica2 Motes and affects their performance. Similarly, there are many simple Mica2 Mote sensor nodes in the network, and their periodic transmissions affect the video communication of the visual sensor nodes. If different communication channels could be used for the two flows (data and video), the performance would improve considerably. The low PDF, high packet dropping ratio and low throughput make it hard to use this simple network for sending high-quality graphics with conventional encoding schemes. The results show that only low-resolution, low-rate video is possible in this lower-capability VSN. At the sink, it will not be possible to reconstruct a meaningful video sequence from received frames/packets that are encoded with conventional encoding schemes. To improve this, a custom low-rate video encoding scheme could be designed to withstand high packet loss and low throughput.
In this experiment, we used IEEE 802.11 as the MAC layer protocol. If the visual sensor nodes are designed as much more powerful nodes that can handle high bandwidth using a more advanced MAC layer protocol, the overall performance can be improved. Further, the Mica2 Motes of Mannasim also use AODV, even though they generally do not forward packets over more than one hop. This is because the scalar sensor data is forwarded to the data sink through an application-level forwarding mechanism that uses only a one-hop broadcast to a specific application port. This may also be a reason for the poor performance. In a future design, it could be avoided by using just a one-hop broadcast without any routing agent.
In the visual sensor nodes, we used AODV as the routing protocol, since the video flows need multi-hop routing to reach remote video sinks. However, if we assume that the VSN is a static network of nodes without any mobility, the periodic broadcasts in AODV also produce unnecessary overhead: if the nodes are not moving, there is no need for very frequent route discovery. So the route discovery mechanism of AODV could be modified to suit the static visual sensor network. This would also improve the performance considerably. The use of TCP for video transport improved the performance in most of the cases.

V. CONCLUSION
Adding visual sensor nodes or camera nodes to an existing sensor network definitely affects the overall performance of the sensor network. The results from the previous section clearly show the performance reduction. Increasing the number of simultaneous video flows affects all aspects of the network. Moreover, the outcome of this research confirms that hardware meant for a normal sensor network supports only simple scalar data flows. The IEEE 802.11 MAC protocol and the common UDP transport protocol, which are generally used in sensor networks, are also not sufficient to handle the simultaneous video flows of a VSN. TCP performed better than UDP in handling video traffic. A simple network such as the one presented in this paper can only support visual sensor data such as low-quality images. Using it for sending video with its present hardware capability is not possible. The communication overhead should be reduced to improve the performance of both data flows and video flows. For that, we may consider using much more powerful visual sensor nodes with advanced transport layer, network layer, and physical layer protocols. In the previous section, we highlighted some of the problems identified during the simulation and provided some suggestions for improvements in video encoding and routing protocols. These can also be considered in the future design of VSN protocols. In this work, issues of typical visual sensor nodes such as image processing, storage and camera calibration were not taken into account, since NS-2/Mannasim has no facilities to incorporate them. Further, we concentrated only on the transport layer and network layer issues of the VSN design. Future work may therefore incorporate the missing VSN issues in the network/simulation design.