
Wednesday, January 28, 2015

Step 3.2: Implementation - Camera Coverage Quality Metric (CCQM)

To this end, we have devised a method to:
  1. Simulate the possible trajectories in the scenario.
  2. Identify regions on the floor with high human occupancy.
This marks the end of the first part. 

In the second part, we would like to use the information generated in the first part to optimize the camera network. To recap, the optimization involves the maximization of
  • area of view coverage (A),
  • human activity volume in view (H),
  • frontal view of humans (F), and
  • resolution of the obtained image (R).
The idea is to define a metric that quantifies the quality of the location and orientation of a camera based on the above four parameters. Let the location and orientation of the camera be defined by the parameters \Omega, which we will define later. The Camera Coverage Quality Metric (CCQM) is a function of \{A, H, F, R\}. Hence,
\begin{equation} CCQM(\Omega) = g(A, H, F, R) = A(\Omega) \cdot H(\Omega) \cdot F(\Omega) \cdot R(\Omega) \end{equation}
where A is a function of \Omega quantifying the area in view, H is a function of \Omega quantifying the human activity in view, F is a function of \Omega quantifying the possible frontal images that can be obtained from the view, and R is a function of \Omega quantifying the possible resolution that can be obtained from the view. Given this metric, the problem is to find the camera parameters \Omega that maximize it. If \Omega^* is the optimal set of parameters, then
\begin{equation} \Omega^* = \arg\max_{\Omega} \{CCQM(\Omega)\} \end{equation}
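
To make this concrete, here is a minimal Python sketch of the evaluation and of a brute-force search over a finite candidate set. The component functions A, H, F and R are assumed to be implemented as described in the rest of this post, and the finite candidate set is an assumption of the sketch (the heuristic search over \Omega is the subject of the next step).

# A minimal sketch: evaluate CCQM and pick the best configuration from a
# finite candidate set. A, H, F and R are assumed to be callables mapping
# a configuration Omega to the component scores defined below.

def ccqm(omega, A, H, F, R):
    """CCQM(omega) = A(omega) * H(omega) * F(omega) * R(omega)."""
    return A(omega) * H(omega) * F(omega) * R(omega)

def best_configuration(candidates, A, H, F, R):
    """Omega* = the candidate configuration maximizing CCQM."""
    return max(candidates, key=lambda omega: ccqm(omega, A, H, F, R))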
The clusters alone provide adequate information to maximize the parameters \{A, H\}; the record of simulated trajectories can be used to maximize the parameters \{F, R\}. Given the parameter \Omega, let us define the functions \{A, H, F, R\}. Assume that the floor is represented by a triangular mesh containing triangles \{t_1, t_2, ..., t_n\} with centroids \{c_1, c_2, ..., c_n\}. Given the configuration \Omega, let \{t^{\Omega}_1, t^{\Omega}_2, ..., t^{\Omega}_m\} be all the triangles in view of the camera.

Area of View (A):

Then the area function A(\Omega) is defined as
\begin{equation} A(\Omega) = \frac{\text{area in view}}{\text{total area of floor}} = \frac{\sum\limits_{i=1}^{m} area(t^{\Omega}_i)}{\sum\limits_{i=1}^{n} area(t_i)} \end{equation}
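
As a sketch, A(\Omega) can be computed directly from the mesh. This assumes the set of triangles in view for \Omega has already been determined (e.g. by testing the mesh against the camera frustum), and uses NumPy for the geometry.

import numpy as np

def triangle_area(t):
    # t is a (3, 3) array holding the three 3-D vertex coordinates.
    a, b, c = np.asarray(t, dtype=float)
    return 0.5 * np.linalg.norm(np.cross(b - a, c - a))

def area_of_view(visible_triangles, all_triangles):
    # A(omega) = (area of the triangles in view) / (total floor area).
    return (sum(triangle_area(t) for t in visible_triangles) /
            sum(triangle_area(t) for t in all_triangles))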

Human Activity Volume (H):

The human occupancy map is calculated from the simulated trajectories, and an occupancy value is assigned to every triangle in the floor mesh as described in the previous blog post. Let O(t) be the occupancy of the triangle t. Then the function H(\Omega) is defined as
\begin{equation} H(\Omega)  = \frac{1}{C}{\sum\limits_{i=1}^m O(t^{\Omega}_i)} \end{equation}
where C is a normalizing constant.
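
A corresponding sketch, assuming the occupancy map is stored per triangle index and that the normalizing constant C is supplied (for example, the total occupancy of the whole floor):

def human_activity_volume(visible_indices, occupancy, C):
    # H(omega) = (1/C) * sum of O(t) over the triangles in view.
    # occupancy maps a triangle index to its occupancy value O(t).
    return sum(occupancy[i] for i in visible_indices) / C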

Frontal View of Humans (F):

To quantify the probable amount of frontal view of humans for the configuration \Omega, we make use of the simulated trajectories. For every triangle t_i in the floor mesh, direction discretization is performed and eight direction vectors \{v^i_1, v^i_2, ..., v^i_8\} are defined, as described by Zhou et al. in [1] (Figure 1).
Figure 1: direction discretization of a floor triangle into eight direction vectors.
Figure 2: the vector transition histogram/matrix.

In the following step, a vector transition histogram/matrix (Figure 2) is constructed from the simulated trajectories. For every simulated trajectory, the consecutive points of the trajectory are used to create direction vectors. Let T be a simulated trajectory of length l, T = \{p_1, p_2, ..., p_l\}. For every pair of consecutive points \{p_{i-1}, p_i\} in T, the trajectory's local direction vector is defined as (p_i - p_{i-1}), and the bin corresponding to the triangle t in which the point p_{i-1} lies and the discretized direction vector closest to (p_i - p_{i-1}) is incremented by 1. Let \Psi(t, v) be the histogram function; then the function F(\Omega) is defined as

\begin{equation} F(\Omega) = \frac{1}{m}\left( ((c-p_{i-1})\cdot v_{k-1})\,\Psi(t_j, v_{k-1}) + ((c-p_{i-1})\cdot v_{k})\,\Psi(t_j, v_{k}) + ((c-p_{i-1})\cdot v_{k+1})\,\Psi(t_j, v_{k+1}) \right) \end{equation}
where t_j is the triangle of the floor mesh in which the point p_{i-1} lies, and v_k is the direction vector closest to (p_i - p_{i-1}):
\begin{equation} k = \arg\max_{k'} \; v_{k'} \cdot (p_i - p_{i-1}) \end{equation}
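
Below is a sketch of the histogram construction, together with one possible reading of F(\Omega) in which the weighted histogram mass is aggregated over the m triangles in view. The eight direction vectors, the point-to-triangle lookup triangle_of, and the use of the camera centre in the dot products are assumptions of this sketch, not prescriptions of [1].

import numpy as np

# Eight discretized direction vectors at 45-degree steps, as in [1].
DIRECTIONS = np.array([(np.cos(k * np.pi / 4), np.sin(k * np.pi / 4))
                       for k in range(8)])

def nearest_direction(v):
    # Index k of the discretized direction closest to v.
    return int(np.argmax(DIRECTIONS @ (v / np.linalg.norm(v))))

def build_transition_histogram(trajectories, triangle_of, n_triangles):
    # Psi(t, v): for every consecutive pair (p_{i-1}, p_i) of a trajectory,
    # increment the bin of the triangle containing p_{i-1} and the
    # direction closest to (p_i - p_{i-1}).
    psi = np.zeros((n_triangles, 8))
    for T in trajectories:
        for p_prev, p in zip(T[:-1], T[1:]):
            step = np.asarray(p, dtype=float) - np.asarray(p_prev, dtype=float)
            if np.linalg.norm(step) == 0.0:
                continue  # skip stationary steps
            psi[triangle_of(p_prev), nearest_direction(step)] += 1
    return psi

def frontal_view(visible_indices, centroids, camera_center, psi):
    # F(omega): weight the histogram mass of the bin k nearest to the
    # triangle-to-camera direction (and its two neighbours) by alignment
    # with that direction, averaged over the m triangles in view.
    total = 0.0
    for j in visible_indices:
        to_cam = np.asarray(camera_center, dtype=float) - centroids[j]
        norm = np.linalg.norm(to_cam)
        if norm == 0.0:
            continue
        to_cam /= norm
        k = nearest_direction(to_cam)
        for dk in (-1, 0, 1):
            idx = (k + dk) % 8
            total += (DIRECTIONS[idx] @ to_cam) * psi[j, idx]
    return total / len(visible_indices)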

Resolution of the Image (R):

This component of the CCQM quantifies the resolution of the obtained images. If the observed object is far from the camera, the obtained resolution is very low and the image might not add any value to the system. This component is application dependent: it can be customized to require a sufficient resolution of any object of interest, which could be just the face or the entire body of a human. We follow the methodology described by Janoos et al. in [2]. The function R(\Omega) is defined as
\begin{equation} R(\Omega) = \frac{1}{m} \sum \limits_{i=1}^{m} \frac{\rho^\Omega (t_i)}{\rho_{min}} \end{equation}

Let C be the center of the camera; then
\begin{equation} \rho^\Omega = \frac{\sigma_{k-1}\,(c-p_{i-1})\cdot v_{k-1} + \sigma_{k}\,(c-p_{i-1})\cdot v_{k} + \sigma_{k+1}\,(c-p_{i-1})\cdot v_{k+1}}{2\pi \, d(C, p_{i-1})^2 \, (1-\cos(\gamma/2))} \end{equation}

where
\gamma is the Y-field of view of the camera,
d(p_1, p_2) is the Euclidean distance between the points p_1 and p_2,
\sigma is the number of pixels that the object occupies in the image,
and \rho_{min} is a user-defined value specifying the minimum required resolution of an object in pixels/inch.

To calculate the number of pixels \sigma, the bounding box of the object is considered and a perspective transformation is applied to its corners to find their locations in the image. The area of the resulting quadrilateral is used as the \sigma value. If (a, b, c, d) are the locations of the corners in the image and \vec{ac}, \vec{bd} are its diagonals, then
\begin{equation} \sigma = \frac{1}{2}\,\|\vec{ac} \times \vec{bd}\| \end{equation}
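
A sketch of this computation, together with the resulting R(\Omega); the corner order (a, b, c, d) around the quadrilateral and the per-triangle resolution lookup rho are assumptions of this sketch:

import numpy as np

def pixel_area(a, b, c, d):
    # sigma = 0.5 * |ac x bd|: area of the projected quadrilateral from
    # the cross product of its two diagonals (2-D image points).
    a, b, c, d = (np.asarray(p, dtype=float) for p in (a, b, c, d))
    ac, bd = c - a, d - b
    return 0.5 * abs(ac[0] * bd[1] - ac[1] * bd[0])

def resolution_score(visible_indices, rho, rho_min):
    # R(omega) = (1/m) * sum of rho(t_i) / rho_min over the m triangles
    # in view; rho maps a triangle index to its achievable resolution.
    return sum(rho(i) / rho_min for i in visible_indices) / len(visible_indices)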

Now that we have a way to quantify the configuration of a camera using the CCQM, in the next step we will consider a heuristic optimization algorithm that maximizes this quantity to obtain the optimum configuration.

[1] W. Zhou, H. Xiong, Y. Ge, J. Yu, H. Ozdemir, and K. C. Lee, "Direction clustering for characterizing movement patterns," in Proc. IEEE International Conference on Information Reuse and Integration (IRI), pp. 165-170, Aug. 2010.
[2] F. Janoos, R. Machiraju, R. Parent, J. W. Davis, and A. Murray, "Sensor configuration for coverage optimization for surveillance applications," 2007.
