Progress 09/01/22 to 08/31/23
Outputs Target Audience:
Nothing Reported
Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Video and Image Annotation Tool: We implemented several improvements and bug fixes in the image and video annotation tool during this reporting period. To speed up the video annotation process, we added a feature to interpolate bounding boxes between two annotated frames: the software generates the hen's intermediate positions between any two annotated frames, producing eight additional groundtruth boxes. We also fixed bugs in the software components that split and merge tracks to remove ambiguities in the groundtruth data, and we added multiple new activity tags/labels to the annotation software.
Analysis of image-plane vs. ground-plane calibrated feature extraction: In the previous reporting period, we developed computer vision algorithms to ortho-rectify video imagery and remove perspective distortion. Our goal was to minimize discrepancies in the magnitude of motion features for hens at varying distances from the sensor by projecting the entire scene onto an ortho-rectified plane. However, hens far from the camera appeared too blurred after ortho-rectification, and the estimated features became unreliable. We also did not observe any notable improvement in the feature extraction components. Therefore, only undistorted imagery was used in our experiments.
Detection of hens in video imagery with ground plane calibration: The Novateur team completed the integration of ground plane calibration with the detection module. The ground plane calibration enables the software to estimate the dimensions of the detected objects (hens). This information is used to filter the output of the detector module by taking into account the expected real-world sizes of detected objects in the video.
Hen Tracking: To track hens in video, we incorporated a state-of-the-art model known as Sparse Graph Tracker (SGT) [HKW+23], which combines Centernet [ZWK19] for object detection with a graph neural network to track objects over time. This is an improvement over the previous method, which was built on a region proposal network for detection. Once the hens are detected, the per-frame boxes are unified into trajectories for each unique hen via a hyper-graph matching algorithm. The older method was trained on top-down footage of a small room with ten hens, whereas the newer SGT model operates on footage of a full warehouse of hens from an isometric point of view, with far more objects to track at once. SGT scales better because it leverages the parallel computing power of the GPU and because the graph neural network learns to track objects more consistently, even with high object density and occlusion.
Motion Modeling and Activity Detection: To train the system to model hen movements and predict the type of activity, video chips of individual hens were extracted using the video annotation tool. These video chips were hand-labeled with the action being performed in order to train action-recognition models. We used a custom deep learning architecture in which a CNN extracts features from a window of 10 input frames and passes them as a sequence to a dilated temporal convolutional network (TCN), which outputs a softmax score over the possible classes: pecking, preening, standing, threat, and walking.
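A minimal sketch of this kind of window-based classifier is shown below, assuming PyTorch; the layer sizes, feature dimensions, and module names are illustrative choices rather than the project's exact architecture.

```python
# Illustrative sketch (PyTorch assumed): a per-frame CNN feeds a dilated
# temporal convolutional network (TCN) that classifies a 10-frame window.
import torch
import torch.nn as nn

ACTIONS = ["pecking", "preening", "standing", "threat", "walking"]

class FrameEncoder(nn.Module):
    """Extracts a feature vector from each frame of a hen video chip."""
    def __init__(self, feat_dim=128):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
        )
        self.fc = nn.Linear(64, feat_dim)

    def forward(self, x):                      # x: (B*T, 3, H, W)
        return self.fc(self.conv(x).flatten(1))

class ActionTCN(nn.Module):
    """Dilated temporal convolutions over the per-frame features."""
    def __init__(self, feat_dim=128, n_classes=len(ACTIONS)):
        super().__init__()
        self.encoder = FrameEncoder(feat_dim)
        self.tcn = nn.Sequential(
            nn.Conv1d(feat_dim, feat_dim, 3, padding=1, dilation=1), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=2, dilation=2), nn.ReLU(),
            nn.Conv1d(feat_dim, feat_dim, 3, padding=4, dilation=4), nn.ReLU(),
        )
        self.head = nn.Linear(feat_dim, n_classes)

    def forward(self, clip):                   # clip: (B, T, 3, H, W)
        b, t = clip.shape[:2]
        feats = self.encoder(clip.flatten(0, 1)).view(b, t, -1)   # (B, T, F)
        feats = self.tcn(feats.transpose(1, 2)).mean(dim=2)       # (B, F)
        return self.head(feats)                # logits; softmax applied at inference

# Example: classify one 10-frame chip of a tracked hen.
model = ActionTCN()
clip = torch.randn(1, 10, 3, 64, 64)
probs = torch.softmax(model(clip), dim=1)
print(dict(zip(ACTIONS, probs[0].tolist())))
```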
To test the prototype system, we cleaned up tracking errors and ensured that only clean data was fed into the training/testing pipeline. From there, a time series of per-frame action labels is extracted over all the frames of a hen. In our experiments, the final trained model obtained an average precision of 0.99, an average recall of 0.97, and an average F1-score of 0.98. Note that the tracking data used for training/testing was hand-labeled, i.e., no tracking errors existed. When automated tracking results are used, however, we expect these figures to degrade, because automated tracking often suffers from incomplete tracks and ID switches between different objects.
Completed processing of video datasets: The Novateur team completed processing of the second dataset, which was provided by our university partner in collaboration with an active poultry farm. Recordings in the new dataset were characterized as either "infected" or "normal". The new dataset was combined with the previous one, and both training and testing of the detection and activity models were performed on the combined set of videos.
Visualization of Poultry Welfare Analytics: We completed software modules to visualize the output of the analytics software, including the video, the status of the flock, and historic movement data.
Software: We completed Python scripts to process the various data streams. These include: loading groundtruth data and images to train the detector; loading groundtruth data and images to train the SGT tracker; loading action tags and video to train deep learning models for action prediction; and processing video to perform detection, tracking, and action prediction.
Conclusion: Our study shows that modeling hen behavior and detecting their activity can serve as an indicator of their health. This indicator can be used to predict potential outbreaks so that prompt action can be taken before problems become severe. The studies performed in Phase II can be extended as follows: Use of multiple video sensors: Our studies included only one video camera for each pen; with multiple video cameras, the system could analyze a larger space and produce more reliable analytics. Camera view angle: During our meetings with poultry farm owners to set up the video recording system, we decided to use their existing video sensors to take advantage of hardware that was already installed. Our analysis was therefore restricted to the view angles set up by their technicians; future studies can investigate the advantages and disadvantages of varying camera view angles. Dealing with dusty environments: Cages in commercial poultry farms are often dusty, which can obstruct the camera views over time; for uninterrupted long-term analysis, an automated mechanism to wipe the camera enclosure can be deployed. Customer education and adoption: In our discussions with poultry farm operators, we concluded that while the industry may be open to incorporating automated poultry health monitoring systems, wide-ranging adoption of the technology will first require educating stakeholders about its benefits and providing the technical support they will need to integrate the system within their infrastructure. The benefits this system can provide include both financial savings and humane treatment of the livestock.
With early intervention, more livestock can be saved and can continue to produce eggs without interruption. If the intervention is not made early enough, livestock too sick to save must be culled, resulting in financial loss. Additionally, many early adopters of video-based technologies are concerned about video recordings being used against them in adverse situations. Stakeholders can be assured that the video in this system is processed in an online fashion and does not need to be recorded. The extracted motion features and other analytical results can be stored in a database for historical analysis; however, they do not include any identifiable information, such as raw image data.
References: [HKW+23] J. Hyun et al., "Detection recovery in online multi-object tracking with sparse graph tracker", Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023. [ZWK19] X. Zhou et al., "Objects as Points", arXiv preprint arXiv:1904.07850, 2019.
Publications
|
Progress 09/01/19 to 08/31/23
Outputs Target Audience:We reached out to poultry farms for feedback regarding operational requirements, market needs, and commercialization potential of the proposed technologies. The feedback we received was very positive. Most poultry farms are moving towards cage-free systems, and the customers we have talked to recognize that a better understanding is needed of laying hen behavior and well-being in cage-free facilities. We believe that while the industry is open to incorporating automated poultry health monitoring systems, wide-ranging adoption of the technology will require educating stakeholders about its benefits and providing the technical support they will need to integrate the system within their infrastructure. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?
Nothing Reported
Impacts What was accomplished under these goals?
Data Collection: The team collected large amounts of video data in two different settings. The first setup was a controlled environment at the Purdue Animal Sciences Research and Teaching Center. A flock of 16-week-old laying hens housed in four cage-free rooms was randomly assigned to one of two groups: a control group and a treatment group that was infested with Northern Fowl mites at 20 weeks of age. The second dataset was obtained in collaboration with a commercial egg-producing farm and includes flocks with and without infection. Birds in all barns were 66 weeks of age when these data were collected.
Video and Image Annotation Tool: Novateur designed a video and image annotation tool. The tool enables an operator to load a video or an image sequence, draw bounding boxes around hens, and assign one or more activity class labels to each hen (pecking, preening, standing, threat, and walking). The groundtruth data are exported in XML format.
Video Groundtruth Generation: A large number of hens were manually annotated using the video tool. Video recordings were manually analyzed and annotated to examine changes in hens' activity levels and space use and to evaluate the progression of the mite infestation. These data were then used to develop the proposed automated visual monitoring and behavior analysis algorithms and to validate the technologies.
Detection of hens in video imagery with ground plane calibration: We trained deep learning models based on Centernet [ZWK19] to detect hens in video. We integrated ground plane calibration with the detection module, which enables the software to estimate the dimensions of the detected objects (hens). This information is used to filter the output of the detector module by taking into account the expected real-world sizes of detected objects.
Hen Tracking: To track hens in video, we incorporated the Sparse Graph Tracker (SGT) [HKW+23], which uses a combination of Centernet for object detection and a graph neural network to track objects over time. Once the hens are detected, the per-frame boxes are unified into trajectories for each unique hen via a hyper-graph matching algorithm.
Development of Automated Behavioral Analysis and Prediction Technologies: We developed machine learning algorithms for detecting hen behavior. The automated behavior detection modules ingest motion features of birds extracted through detection and tracking and classify them into behaviors that are indicative of hen health. Collectively, these signals can predict the overall state of the hen population, both at the micro-level (individuals) and at the macro-level (crowd as a whole). We define mid-level kinematic features as those that describe the local motion of each bird within its bounding box. High-level behavioral features seek to identify behaviors such as wing-flapping, pecking, and preening, which cannot be obtained directly from low-level motion or trajectory information. We use the Lucas-Kanade optical flow algorithm to compute the motion of individual pixels within the image. We then compute statistical moments (mean, variance, skew, kurtosis) of the flow field to obtain macro-signatures that indicate the overall motion in the scene. We also use the bounding-box outputs from our tracker to crop the flow field to the regions corresponding to hens, which gives a local per-frame motion descriptor for each detected object from which we compute statistical moments. These individual descriptors can be kept separate or merged for global analysis.
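As a minimal illustration of these flow-based descriptors (OpenCV and SciPy assumed): dense Farneback flow is used here as a stand-in for the project's dense Lucas-Kanade implementation, and the frame file names and box coordinates are hypothetical.

```python
# Illustrative sketch (OpenCV + SciPy assumed). Dense Farneback flow stands in
# for the project's dense Lucas-Kanade implementation; box coordinates are
# hypothetical tracker output in (x, y, w, h) pixel format.
import cv2
import numpy as np
from scipy import stats

def flow_moments(mag):
    """Mean, variance, skew, and kurtosis of a flow-magnitude array."""
    m = mag.ravel()
    return {"mean": float(m.mean()), "var": float(m.var()),
            "skew": float(stats.skew(m)), "kurtosis": float(stats.kurtosis(m))}

def motion_descriptors(prev_gray, curr_gray, boxes):
    """Global scene moments plus one per-hen descriptor per tracked box."""
    flow = cv2.calcOpticalFlowFarneback(prev_gray, curr_gray, None,
                                        0.5, 3, 15, 3, 5, 1.2, 0)
    mag = np.linalg.norm(flow, axis=2)              # per-pixel motion magnitude
    per_box = []
    for (x, y, w, h) in boxes:
        crop = mag[y:y + h, x:x + w]
        if crop.size:
            per_box.append(flow_moments(crop))
    return flow_moments(mag), per_box

# Example: two consecutive frames (hypothetical paths) and one tracked hen box.
prev = cv2.cvtColor(cv2.imread("frame_000.png"), cv2.COLOR_BGR2GRAY)
curr = cv2.cvtColor(cv2.imread("frame_001.png"), cv2.COLOR_BGR2GRAY)
scene_desc, hen_descs = motion_descriptors(prev, curr, [(120, 80, 60, 50)])
```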
We track the evolution of these behavioral markers over time to identify peaks in activity (anomalies) that may correspond to events of interest. The motion vectors in the image can be rectified to correct for the perspective distortion of the camera setup, in order to make activities at different depths comparable. We developed a simple method for carrying out this rectification by exploiting known geometric structures in the poultry facility (such as parallel lines and vanishing points). We trained models to predict what action a hen was performing in each frame. To test the prototype system, we cleaned up tracking errors so that only clean data was used for training/testing. A time series of per-frame action labels is extracted over all frames. The final trained model obtained an average precision of 0.99, an average recall of 0.97, and an average F1-score of 0.98. When automated tracking results are used, however, we expect these figures to drop, because automated tracking often suffers from incomplete tracks and ID switches between different objects.
Visualization of Poultry Welfare Analytics: We developed software modules to visualize the output of the analytics software, including the monitoring video, the status of the flock, and historic movement data.
Software: We completed Python scripts to process the various data streams, including: loading groundtruth data and images to train the detector and tracker algorithms; loading action tags and video to train deep learning models for action prediction; processing video for detection, tracking, and action prediction; and evaluating system components.
Conclusion: Our study shows that modeling hen behavior and detecting their activity can serve as an indicator of their health. This indicator can be used to predict potential outbreaks so that prompt action can be taken before problems become severe. Future extensions of the Phase II work may include: Multiple video feeds: Our studies included only one video camera for each pen; with multiple video cameras, the system could analyze a larger space and produce more reliable analytics. Camera view angle: We used the existing video sensors at the poultry farm to take advantage of hardware that was already installed, so our analysis was restricted to the view angles set up by their technicians. Future studies can investigate the pros and cons of different camera angles. A downward-looking camera may provide larger coverage and less perspective distortion; however, it may not support accurate motion features for action prediction because of the different view angle. Dealing with dusty environments: Cages in commercial poultry farms are often dusty, which can obstruct the camera views over time. An automated mechanism can be deployed on a timer to periodically wash/wipe the camera enclosure window. Customer education and adoption: In our discussions with poultry farm operators, we concluded that while the industry is open to incorporating automated poultry health monitoring systems, wide-ranging adoption of the technology will require educating stakeholders about its benefits and providing the technical support they will need to integrate the system within their infrastructure. The benefits this system can provide include both financial savings and humane treatment of the livestock.
Additionally, many early adopters of video-based technologies are concerned about video recordings being used against them in adverse situations. Stakeholders can be assured that the video in this system is processed in an online fashion and does not need to be recorded. The extracted motion features and other analytical results can be stored in a database for historical analysis; however, they do not include any identifiable information, such as raw image data. Addressing these concerns early on will pave the way for wide-ranging adoption of the technology in the poultry industry.
References: [HKW+23] J. Hyun et al., "Detection recovery in online multi-object tracking with sparse graph tracker", Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, 2023. [ZWK19] X. Zhou et al., "Objects as Points", arXiv preprint arXiv:1904.07850, 2019.
Publications
|
Progress 09/01/21 to 08/31/22
Outputs Target Audience:We have reached out to poultry farms for feedback regarding operational requirements, market needs, and commercialization potential of the proposed technologies. The feedback we have received is very positive. Most poultry farms are moving towards cage-free systems, and the customers we have talked to recognize that a better understanding is needed of laying hen behavior and well-being in cage-free facilities. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?Improving High-Level Behavior Modeling: We plan to further improve our technologies for behavioral modeling during the remainder of the contract. We will incorporate recent advancements in deep learning approaches for modeling time-series data. For example, we hope to better classify certain hen behaviors such as pecking and preening. We also plan to refine the inference methodology to use the behavior output of the classifier as a predictor for the presence of parasitic infestation at the flock level. System Testing: We will continue to test the individual components of the system as well as the entire end-to-end system during the remainder of this project. This will include long-term consistent video detection and tracking of flocks, identification of any memory leaks, and software robustness against video feed failures and communication drops.
Impacts What was accomplished under these goals?
Ground-plane calibration to minimize perspective distortion artifacts: Due to perspective distortion, the motion features and their magnitudes appear different depending on an object's distance from the camera. To address this problem, the Novateur team developed computer vision algorithms to rectify video imagery and remove perspective distortion. We implemented image processing algorithms to detect railings and other components in the pen and applied image warping techniques for ground-plane rectification. This approach generates motion features of hen movements that are consistent across the camera's field of view and enables motion models of normal/abnormal behavior that are independent of the camera view angle. Therefore, models generated in one scenario can be deployed to other settings for observing poultry behavior.
Annotation Tool Improvements: During this reporting period, we made several additional improvements to the video annotation tool and fixed a few bugs. We added functionality to split/merge tracks to remove ambiguities in the groundtruth data, and we added a secondary activity tag to the tracks to allow multiple labels.
Processing of new datasets: The Novateur team received additional poultry farm datasets from our university partner, Purdue Animal Sciences, comprising clean as well as infected flocks. The researchers at the university provided insight into the behavior of the birds. With their input, Novateur generated groundtruth data for training motion models that represent normal behavior and the abnormal behavior seen when hens are sick. We processed the video data to extract motion features of the birds in each video and generated motion models capturing their behavior in these scenarios. Novateur combined the new dataset with the existing dataset collected at the beginning of the project for training purposes, and we expect that the collective information from both datasets will further improve the overall accuracy of the system.
Improved detection of hens in video imagery: The Novateur team further improved the accuracy of the detection algorithm. Using the ground-plane calibration steps described above, we reduced detection errors by taking into account the expected real-world sizes of detected objects in the video. For example, we can remove detections that are too small to be true detections, and we can split detections that are too large to be a single hen and may contain multiple hens within one bounding box.
Continued Improvements in Motion Modeling and Anomaly Detection Methods: We improved our anomaly detection algorithm by automating the threshold selection. Motion anomalies cause poor reconstruction of the motion field and are detected by applying a threshold to the reconstruction error. We used the training data from clean and sick flocks to determine this threshold automatically (a minimal illustration of this selection step is sketched at the end of this section).
Interactive Dashboard for Visualizing Live Poultry Welfare Analytics: We made several improvements to the analytics dashboard, which can now monitor the current status of the flock as well as historic data.
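The automated threshold selection mentioned above could look like the following minimal sketch (NumPy assumed), given per-clip reconstruction errors for clean and sick flocks; the function name and the balanced-accuracy criterion are illustrative choices, not necessarily the exact rule used in the project.

```python
# Illustrative sketch (NumPy assumed): pick the reconstruction-error threshold
# that best separates clean-flock clips from sick-flock clips in training data.
import numpy as np

def select_threshold(clean_errors, sick_errors):
    """Return the threshold maximizing balanced accuracy on the training sets."""
    clean = np.asarray(clean_errors, dtype=float)
    sick = np.asarray(sick_errors, dtype=float)
    candidates = np.unique(np.concatenate([clean, sick]))
    best_t, best_score = candidates[0], -1.0
    for t in candidates:
        tpr = np.mean(sick > t)        # sick clips flagged as anomalous
        tnr = np.mean(clean <= t)      # clean clips passed as normal
        score = 0.5 * (tpr + tnr)      # balanced accuracy
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Example with hypothetical per-clip reconstruction errors.
clean = [0.11, 0.09, 0.14, 0.12, 0.10]
sick = [0.21, 0.35, 0.18, 0.27, 0.30]
threshold, score = select_threshold(clean, sick)
print(f"threshold={threshold:.3f}, balanced accuracy={score:.2f}")
```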
Publications
|
Progress 09/01/20 to 08/31/21
Outputs Target Audience:We have reached out to poultry farms for feedback regarding operational requirements, market needs, and commercialization potential of the proposed technologies. The feedback we have received is very positive. Most poultry farms are moving towards cage-free systems, and the customers we have talked to recognize that a better understanding is needed of laying hen behavior and well-being in cage-free facilities. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?Improved Robustness in Tracking: In the remaining part of Phase II, we plan to continue making improvements to our tracking system to increase its robustness by modeling the data association as an optimization problem using both appearance and spatial features. The Novateur Team has experience developing state-of-the-art multi-target tracking algorithms which utilize graph representations to handle occlusions and object interactions natively by solving a graph matching optimization problem via network flow formulation. In fact, we implemented a version of such a tracker during the Phase I of this contract and demonstrated its effectiveness in tracking hens in small-scale overhead imagery. We plan to incorporate our efficient and accurate in-house tracking algorithms into our latest poultry welfare analysis system in the second half of this reporting period. High-Level Behavior Modeling: We hope to improve our technologies for behavioral feature extraction during the remainder of the contract. We plan to build on our work in this area, which consisted of training a deep neural network (based on LSTM) for classifying behaviors such as pecking and preening. We tested this classifier on several thousand manually selected examples of preening and non-preening, showing promising accuracy. This research work also produced a method which used the level of preening and non-preening behavior as a predictor for the presence of parasitic infestation at the flock-level, with early results suggesting its effectiveness. However, obtaining enough training data in diverse commercial poultry facilities to effectively train a deep learning model to recognize such behaviors may be difficult. System Integration and Testing: While the individual components of the software have been shown to work effectively, what remains is to integrate these modules into a robust, end-to-end software system for precision livestock farming (PLF). This will require standardization of the inputs and outputs to the various modules, as well as efficiency improvements to the algorithm implementations to ensure the proposed system can operate efficiently in time-constrained environments.
Impacts What was accomplished under these goals?
Annotation Tool Improvements: During this reporting period, we have made significant improvements to the annotation tool's front-end interface and backend storage schema. We have added the ability to annotate overlapping intervals of hen behavior - a feature which is necessary to capture multiple behaviors that occur at the same time (such as simultaneously sitting and being the recipient of feather-pecking activity). We have also improved visibility in videos with poor lighting by adding a gamma adjustment slider bar.
Dataset from Poultry Farms: We have obtained additional data from Purdue Animal Sciences. Birds in all barns were 66 weeks of age when data were collected.
Detection of individual hens in video imagery: The commercial poultry facility presents a number of challenges for off-the-shelf object detectors. We developed an object detector which utilizes a form of one-shot learning from the first frame of the video. This method, based on siamese learning, is robust to noise and appearance changes, can be used to detect multiple objects, and does not require any target-domain training data. The siamese object detection system uses multiple templates selected in the first frame to learn the typical appearance of a chicken in a high-dimensional feature space. The network then uses these template embeddings to search for similar-looking objects in the scene. We select objects at multiple scales to enable detection of objects appearing at different depths from the camera. We carry out a form of ensemble inference by cross-correlating the image features with several different templates. Using multiple templates significantly improves the variety of detected objects while incurring only a mild overhead, since the extra computation occurs at the final cross-correlation step. To decode the predictions, we find the argmax template for each spatial cell output and obtain the corresponding bounding box prediction and correlation score. After applying the predicted regression offsets from the RPN, the final bounding box predictions are subjected to non-max suppression and a confidence threshold.
Tracking of detected hens over time: After detection, the subsequent target tracking module solves a data association problem to link the detections in previous frames with those in the current frame. The result is a set of unique identifiers which determine the estimated trajectories of the individual chickens in the video over time. The tracker needs to handle several challenges that arise in realistic, unconstrained multi-target tracking scenarios, such as occlusion, spurious detections, false negatives, and crowds of small, similar-looking objects.
Preliminary Mid-Level Kinematic Features and Anomaly Detection Methods: We have also developed methods for extracting motion information, behavioral indicators, and other metadata about hen movement. Collectively, these signals tell a compelling story about the overall state of the chicken population, both at the micro-level (individuals) and at the macro-level (crowd as a whole). We define mid-level kinematic features as those that describe the local motion of each bird within its bounding box. Conversely, high-level behavioral features seek to identify animal behavior such as wing-flapping, pecking, and preening, which cannot be obtained directly from low-level motion or trajectory information.
From the bounding-box trajectories of the individual hens obtained from our tracking system, we can obtain velocities and changes in size (due to wing-flapping, posture changes, etc.) at the micro-level. We use the Lucas-Kanade optical flow algorithm to compute the motion of the individual pixels within the image from one frame to the next. We then compute statistical moments (mean, variance, skew, kurtosis) of the flow field to obtain macro-signatures which give indications of the overall motion in the scene. Using the tracker output, a local per-frame motion descriptor is computed for each detected object, from which we can compute statistical moments. These individual descriptors can be kept separate or can be merged into a bag for global analysis. We track the evolution of these behavioral markers over time to identify peaks in activity which may correspond to events of interest. We use a simple peak detection algorithm for this form of anomaly detection.
Improved Motion Modeling and Anomaly Detection Methods: Our improved motion description pipeline treats the optical flow grid as a vector field and approximates this field by fitting a linear combination of radial basis functions (RBFs) to the ground truth optical flow map (a minimal sketch of this fit appears at the end of this section). Once the linear regression (or ridge regression) model has been fitted, the coefficients for the current frame are fed to a sequence-to-sequence anomaly detection network. Motion anomalies are assumed to cause poor reconstruction and can be detected by applying a threshold to the reconstruction error.
Interactive Dashboard for Visualizing Live Poultry Welfare Analytics: We have also developed a real-time analytics dashboard which shows the current status of the algorithms and indicates detected anomalies in the chicken flock.
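A minimal sketch of the RBF fit referenced above is given below (NumPy assumed); the grid size, bandwidth, and ridge weight are illustrative choices rather than the project's tuned settings.

```python
# Illustrative sketch (NumPy assumed): approximate a dense optical-flow field
# with a ridge-regression fit over Gaussian radial basis functions placed on a
# coarse grid, then score the frame by its reconstruction error.
import numpy as np

def rbf_design_matrix(h, w, grid=8, sigma=20.0):
    """One Gaussian RBF per cell of a grid x grid layout over the image."""
    ys, xs = np.mgrid[0:h, 0:w]
    pts = np.stack([ys.ravel(), xs.ravel()], axis=1).astype(float)    # (H*W, 2)
    cy = np.linspace(0, h - 1, grid)
    cx = np.linspace(0, w - 1, grid)
    centers = np.array([(y, x) for y in cy for x in cx])               # (grid^2, 2)
    d2 = ((pts[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / (2.0 * sigma ** 2))                            # (H*W, grid^2)

def fit_flow_frame(flow, grid=8, sigma=20.0, ridge=1e-2):
    """Return RBF coefficients and the frame's reconstruction error."""
    h, w, _ = flow.shape
    phi = rbf_design_matrix(h, w, grid, sigma)
    target = flow.reshape(-1, 2)                                       # u and v channels
    a = phi.T @ phi + ridge * np.eye(phi.shape[1])
    coeffs = np.linalg.solve(a, phi.T @ target)                        # (grid^2, 2)
    recon_error = float(np.mean((phi @ coeffs - target) ** 2))
    return coeffs, recon_error

# Example with a random stand-in flow field; real input would come from optical flow.
flow = np.random.randn(120, 160, 2)
coeffs, err = fit_flow_frame(flow)
print(coeffs.shape, err)
```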
Publications
|
Progress 09/01/19 to 08/31/20
Outputs Target Audience:We have reached out to poultry farms for feedback regarding operational requirements, market needs, and commercialization potential of the proposed technologies. The feedback we have received is very positive. Most poultry farms are moving towards cage-free systems, and the customers we have talked to recognize that a better understanding is needed of laying hen behavior and well-being in cage-free facilities. For example, Rose Acre Farms, the second-largest egg producer in the U.S., showed great interest in automated technologies that will enable animals to be monitored in real time without requiring additional labor. In particular, they were appreciative of the non-invasive monitoring and early response capabilities that Novateur's system can provide. Creighton Brothers, another farm with more than 2 million hens and ranked 30th in U.S. egg production, showed interest in utilizing technology to promote hen well-being and to be able to monitor hen health and behavior in real time. Changes/Problems:
Nothing Reported
What opportunities for training and professional development has the project provided?
Nothing Reported
How have the results been disseminated to communities of interest?
Nothing Reported
What do you plan to do during the next reporting period to accomplish the goals?In the latter half of this reporting period, we plan to make improvements to our tracking system to increase its robustness by modeling the data association as an optimization problem using both appearance and spatial features. The Novateur Team has experience developing state-of-the-art multi-target tracking algorithms which utilize graph representations to handle occlusions and object interactions natively by solving a graph matching optimization problem via network flow formulation. In fact, we implemented a version of such a tracker during the Phase I of this contract and demonstrated its effectiveness in tracking hens in small-scale overhead imagery. We plan to incorporate our efficient and accurate in-house tracking algorithms into our latest poultry welfare analysis system in the second half of this reporting period. During this half of the reporting period, we have focused on computing accurate mid-level kinematic features at scale, while we hope to improve our technologies for behavioral feature extraction during the remainder of the contract. We plan to build on our Phase I work in this area, which consisted of training a deep neural network (based on LSTM) for classifying behaviors such as pecking and preening. We tested this classifier during Phase I on several thousand manually selected examples of preening and non-preening, showing promising accuracy. This research work also produced a method which used the level of preening and non-preening behavior as a predictor for the presence of parasitic infestation at the flock-level, with early results suggesting its effectiveness. While our algorithms are evaluated on real data from a commercial poultry facility, our neural network is trained on five generic large-scale video and image datasets (GOT-10K, ILSVRC-VID, YouTube-BB, ImageNet Detection, and Microsoft COCO) to ensure it can generalize well to novel categories and scenes. In the coming months, we hope to use our in-house annotation tool to collect examples of animal behaviors such as preening, dustbathing, and pecking, in order to train machine learning models which can recognize such behaviors in individual chickens from video footage. While the individual components of the software have been shown to work effectively, what remains is to integrate these modules into a robust, end-to-end software system for precision livestock farming (PLF). This will require standardization of the inputs and outputs to the various modules, as well as efficiency improvements to the algorithm implementations to ensure the proposed system can operate efficiently in time-constrained environments.
Impacts What was accomplished under these goals?
During this reporting period, the Novateur Team has focused its efforts on scaling up the computer vision technologies developed in Phase I to work in larger, industrial-size poultry facilities. These more realistic environments introduce a number of challenges which complicate the recognition and behavior modeling tasks, including perspective distortion, barrel distortion, occlusion, poor or uneven illumination, and out-of-focus effects. Furthermore, commercial cage-free layer houses typically feature a much larger number of animals, creating a need for custom methods for detecting and tracking objects in crowded scenes. To accelerate the annotation process for data of this form, we developed a custom in-house video annotation tool based on our existing software for bounding box annotation and interpolation in video data. The tool, developed in C++ with the Qt GUI framework, allows the user to quickly and easily locate objects by bounding box in the video frames, while also annotating the frame-by-frame behavior of each actor (such as preening, pecking, dustbathing) and associated metadata such as directionality of the interaction (for behaviors involving multiple actors) and behavior modifiers (such as sitting/standing). The bounding boxes are propagated forward by interpolation between anchor boxes hand-located at keyframes. The tool supports loading both video and image-sequence data, and writes the annotations to a custom XML format which can easily be processed by the other modules of our software system.
The commercial poultry facility presents a number of challenges which prevent the direct application of the object recognition methods developed in Phase I. In particular, training a deep learning model for each new scenario is not feasible, and models trained on limited data cannot be expected to generalize well given the difficulty of the recognition task. For this reason, and because it is time-consuming and expensive to obtain high-quality ground truth data for individual chickens, we developed an object detector which utilizes a form of one-shot learning from the first frame of the video. This method, based on siamese learning, is robust to noise and appearance changes, can be used to detect multiple objects, and does not require any target-domain training data. The siamese object detection system uses multiple templates selected in the first frame to learn the typical appearance of a chicken in a high-dimensional feature space. The network then uses these template embeddings to search for similar-looking objects in the scene. We select objects at multiple scales to enable detection of objects appearing at different depths from the camera. The siamese network is trained for the single-object tracking task on a diverse, large-scale video dataset using nearby frames. The training set does not contain large numbers of examples of egg-laying chickens, but the category-agnostic design means the algorithm can generalize across categories successfully as long as the domain gap is reasonable.
Earlier in the reporting period, we developed a number of simpler object detection algorithms using classical computer vision techniques such as background subtraction and blob detection. Although these methods can run in real time and make few assumptions about the target domain data, they have poor precision because of their sensitivity to noise. Furthermore, their recall suffers because they often fail to detect stationary or slowly-moving objects.
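As a minimal illustration of the classical baseline described above (OpenCV assumed), the following sketch performs background subtraction followed by contour-based blob detection; the parameter values and file name are illustrative rather than the project's tuned settings.

```python
# Illustrative sketch (OpenCV assumed): background subtraction plus contour-based
# blob detection as a simple, real-time hen detection baseline.
import cv2

MIN_AREA = 400   # illustrative: reject blobs too small to be a hen

def detect_moving_hens(video_path):
    cap = cv2.VideoCapture(video_path)
    subtractor = cv2.createBackgroundSubtractorMOG2(history=200, varThreshold=25)
    kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5))
    detections = []                           # (frame_index, x, y, w, h)
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        mask = subtractor.apply(frame)
        mask = cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)   # remove speckle noise
        contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_SIMPLE)
        for c in contours:
            if cv2.contourArea(c) >= MIN_AREA:
                x, y, w, h = cv2.boundingRect(c)
                detections.append((frame_idx, x, y, w, h))
        frame_idx += 1
    cap.release()
    return detections

# Example usage with a hypothetical video file.
boxes = detect_moving_hens("pen_camera.mp4")
```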
The siamese object detection module predicts the locations and sizes of the chickens, in the form of a list of bounding boxes in the raw image frame, together with normalized confidence scores used to select an operating point on the precision-recall curve. The subsequent target tracking module solves a data association problem to link the detections in previous frames with those in the current frame. The result is a set of unique identifiers which determine the estimated trajectories of the individual chickens in the video over time. During this reporting period, we have developed a simple multi-target tracking algorithm using a Kalman filter for state estimation and prediction, and the Hungarian matching algorithm with a pixel-distance cost for solving the data association problem (a minimal sketch of this tracking loop appears at the end of this passage). We model the state as a 4-vector (x, y, w, h) with the position and box dimensions and use a linear velocity model for the forward kinematics. We tune the covariance parameters to obtain good performance on the sample video. A track is deleted if no detection can be matched to it for 30 consecutive frames. This simple tracking system can handle momentary occlusions and does a reasonably good job of distinguishing objects that are nearby. However, one drawback of this approach is that it solves a myopic frame-to-frame association problem instead of allowing tracks to branch for several frames. Furthermore, it does not exploit the object appearance features which can be used to disambiguate nearby objects and avoid ID switches.
In addition to locating and tracking the individual hens in the video footage, we have also developed methods for extracting motion information, behavioral indicators, and other metadata about hen movement. Collectively, these signals tell a compelling story about the overall state of the chicken population, both at the micro-level (individuals) and at the macro-level (crowd as a whole). We define mid-level kinematic features as those that describe the local motion of each bird within its bounding box. Conversely, high-level behavioral features seek to identify animal behavior such as wing-flapping, pecking, and preening, which cannot be obtained directly from low-level motion or trajectory information. From the bounding-box trajectories of the individual hens obtained from our tracking system, we can obtain velocities and changes in size (due to wing-flapping, posture changes, etc.) at the micro-level. We use the Lucas-Kanade optical flow algorithm to compute the motion of the individual pixels within the image from one frame to the next. (The team at Novateur has experience with fast optical flow methods based on convolutional neural networks that can run in real time on the GPU, which we also plan to test for this application.) We then compute statistical moments (mean, variance, skew, kurtosis) of the flow field to obtain macro-signatures which give indications of the overall motion in the scene. In addition to optical flow as a macro-level motion indicator, we use the bounding box outputs from our tracker to crop the flow field to the regions corresponding to chickens. This gives a local per-frame motion descriptor for each detected object from which we can compute statistical moments. These individual descriptors can be kept separate or can be merged into a bag for global analysis.
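Returning to the tracking module described above, the following minimal sketch (NumPy and SciPy assumed) illustrates frame-to-frame association with a constant-velocity Kalman filter per track and Hungarian matching on center-pixel distance; the noise values, gating distance, and code structure are illustrative rather than the project's implementation.

```python
# Illustrative sketch (NumPy + SciPy assumed): constant-velocity Kalman tracks
# matched to detections with the Hungarian algorithm on center-pixel distance.
import numpy as np
from scipy.optimize import linear_sum_assignment

STATE_DIM, MEAS_DIM, MAX_MISSES, GATE = 6, 4, 30, 80.0   # illustrative values

# State x = [cx, cy, w, h, vx, vy]; we observe (cx, cy, w, h).
F = np.eye(STATE_DIM); F[0, 4] = F[1, 5] = 1.0            # constant-velocity transition
H = np.zeros((MEAS_DIM, STATE_DIM)); H[:4, :4] = np.eye(4)
Q = np.eye(STATE_DIM) * 1.0                               # process noise (illustrative)
R = np.eye(MEAS_DIM) * 10.0                               # measurement noise (illustrative)

class Track:
    _next_id = 0
    def __init__(self, box):
        self.x = np.concatenate([box, [0.0, 0.0]])
        self.P = np.eye(STATE_DIM) * 100.0
        self.misses = 0
        self.id = Track._next_id
        Track._next_id += 1
    def predict(self):
        self.x = F @ self.x
        self.P = F @ self.P @ F.T + Q
    def update(self, box):
        y = box - H @ self.x
        S = H @ self.P @ H.T + R
        K = self.P @ H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(STATE_DIM) - K @ H) @ self.P
        self.misses = 0

def step(tracks, detections):
    """Advance all tracks one frame; detections are (cx, cy, w, h) tuples."""
    for t in tracks:
        t.predict()
    if tracks and len(detections):
        cost = np.linalg.norm(
            np.array([t.x[:2] for t in tracks])[:, None, :]
            - np.asarray(detections, dtype=float)[None, :, :2], axis=2)
        rows, cols = linear_sum_assignment(cost)
    else:
        rows, cols, cost = [], [], None
    matched_tracks, matched_dets = set(), set()
    for r, c in zip(rows, cols):
        if cost[r, c] < GATE:                              # gate out implausible matches
            tracks[r].update(np.asarray(detections[c], dtype=float))
            matched_tracks.add(r); matched_dets.add(c)
    for i, t in enumerate(tracks):
        if i not in matched_tracks:
            t.misses += 1
    tracks = [t for t in tracks if t.misses <= MAX_MISSES]  # drop stale tracks
    for j, det in enumerate(detections):
        if j not in matched_dets:
            tracks.append(Track(np.asarray(det, dtype=float)))
    return tracks

# Example: two frames of hypothetical detections.
tracks = step([], [(100, 50, 40, 30), (300, 120, 42, 31)])
tracks = step(tracks, [(104, 52, 40, 30), (305, 118, 42, 31)])
print([(t.id, t.x[:2].round(1).tolist()) for t in tracks])
```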
In order to reduce the effects of spurious detections, we developed a technique for weighting the contributions of the vectors in the flow-field based on the correlation heatmap obtained from our siamese object detector. This allows us to discount motion vectors which do not have strong appearance similarity to the template chickens. We track the evolution of these behavioral markers over time to identify peaks in activity which may correspond to events of interest. We use a simple peak detection algorithm for this form of anomaly detection. Because all of the motion vectors described above can be attributed to a pixel location in the image, they can be rectified to correct for the perspective distortion in the camera setup, in order to make activities at different depths comparable. We developed a simple method for carrying out this rectification by exploiting known geometric structures in the poultry facility (such as parallel lines and vanishing points).
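A minimal sketch of this kind of rectification is given below (OpenCV and NumPy assumed): a homography is estimated from four manually identified ground-plane landmarks and applied to motion vectors so that displacements at different depths become comparable. The point coordinates are hypothetical.

```python
# Illustrative sketch (OpenCV + NumPy assumed): estimate a ground-plane
# homography from four known reference points and use it to rectify motion
# vectors so that displacements at different depths are comparable.
import cv2
import numpy as np

# Hypothetical pixel locations of four ground-plane landmarks (e.g. railing
# corners) and their layout in a rectified, metric-like ground-plane frame.
image_pts = np.float32([[210, 480], [1050, 470], [820, 220], [380, 225]])
ground_pts = np.float32([[0, 0], [4.0, 0], [4.0, 6.0], [0, 6.0]])   # meters

H, _ = cv2.findHomography(image_pts, ground_pts)

def rectify_vectors(points, displacements):
    """Map per-pixel motion vectors from image space to the ground plane."""
    p0 = np.asarray(points, dtype=np.float32).reshape(-1, 1, 2)
    p1 = p0 + np.asarray(displacements, dtype=np.float32).reshape(-1, 1, 2)
    g0 = cv2.perspectiveTransform(p0, H)
    g1 = cv2.perspectiveTransform(p1, H)
    return (g1 - g0).reshape(-1, 2)          # ground-plane displacement vectors

# Example: the same pixel displacement near and far from the camera maps to
# different ground-plane magnitudes after rectification.
moves = rectify_vectors([(300, 460), (600, 240)], [(5, 0), (5, 0)])
print(moves)
```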
Publications
|