David J. Neu

David is a data scientist, software developer, and project manager. He has a B.E. in Electrical Engineering/Computer Science and an M.S. in Computer Science from Stevens Institute of Technology, and a Ph.D. in Operations Research from RUTCOR at Rutgers University. While working in both academic and commercial organizations he has supported commercial, government, and military customers in a wide variety of fields, and has been successful both leading teams and working as an individual contributor. David grew up in New Jersey and now lives in New Hampshire near Dartmouth College with his family and dog.

Contact

He can be found on Twitter as @davidjneu, on LinkedIn as davidneu, and on IRC as davidneu.

Profile

Enjoys developing software and understanding data and is happiest when he's writing software to understand data.

Has programmed professionally in C, C++, Clojure, Common Lisp, FORTRAN, Java, Jython, PL/I, Pascal, Prolog, Python, R, Scheme, and sh.

Currently does most of his programming in Clojure using mvt.

Uses relational databases such as PostgreSQL and SQLite, as well as NoSQL databases such as DynamoDB.

Develops and deploys systems in Docker containers using Docker Compose.

Runs the Ubuntu distribution of Linux installed from the mini.iso on a Thinkpad laptop.

Uses the dwm window manager, git with magit for version control, and spends the vast majority of his time in emacs.

Leverages AWS services whenever possible.

Select Academic Work

As a member of a Rutgers team working on a DARPA-funded research grant, he developed the first version of a collaborative search tool known as AntWorld. He gained experience with various information retrieval tasks and worked with what is now referred to as "big data" by participating in NIST's Text REtrieval Conferences (TREC). Also at Rutgers, he spent some time working on document summarization and presented at the Document Understanding Conference (DUC) 2001. His dissertation, "Feature Selection with Applications to Text Classification," was written under Endre Boros. It investigates the mathematical properties of feature ranking functions and includes empirical studies of the Reuters 21578 test collection, making extensive use of R for both analysis and visualization.

Recent Professional Experience

As a data scientist and software developer in the Department of Biomedical Data Science at the Dartmouth Geisel School of Medicine, he is currently supporting a complex longitudinal research study. He designed and developed a web application for the collection, management, and analysis of data, while ensuring proper implementation of the study protocol. The system is written in Clojure with PostgreSQL, deployed on Ubuntu Linux using Docker and Docker Compose, and uses Nginx as a reverse proxy.

Also at the Dartmouth Geisel School of Medicine, he collaborated with a team of researchers at The Dartmouth Institute for Health Policy and Clinical Practice on the development of a sampling frame for a large survey of healthcare organizations. Using Clojure, as well as R and SQLite, he designed and developed a system built on a hierarchical data structure that allowed researchers to investigate a wide range of sampling frame scenarios by altering complex inclusion criteria.

At Opentoit, LLC, while supporting a USAF contract, he completed an MQTT-based communication system. In addition to supporting the publish/subscribe pattern inherent in the MQTT protocol, the system also supports the request/reply pattern and priority messaging. The system is written in Clojure and leverages AWS managed services, including DynamoDB, SQS, S3, autoscaling, ELB, VPC, and EC2, to create a highly available, scalable, and secure system.

As Director, Safety Software Systems at UTRS, Inc., he led teams supporting the FAA's Air Carrier Training Systems and Voluntary Safety Programs Branch and the USAF Safety Center. He worked with the FAA to define the strategy of providing US air carriers with an open source safety system and then led the design, development, operation, and maintenance of the open source WBAT system, which provides a complete Safety Management System (SMS) and is used by over 180 operators.

He developed mechanisms for collecting and managing data that was sensitive to both individuals and organizations, and acted as a trusted steward of this data. He also defined analyses and visualizations that protected both individual and organizational identity, and oversaw the development of a taxonomy for classifying safety events.

He also led the team that extended WBAT to provide the USAF with an open source MFOQA and ASAP system. This system processes binary flight recorder files that are regularly downloaded from fifteen different types of aircraft into a format that can be readily analyzed. Each month, these files are collected from thousands of individual aircraft and hundreds of thousands of flights, resulting in terabytes of data. The system then applies algorithms to the data associated with each flight to identify points of interest, and provides tools for analyzing the resulting aggregate information.

He managed the transition of both WBAT and the USAF systems from an on-prem hosting facility to AWS, and managed the group that secured the Authority to Operate (ATO) per the Defense Information Assurance Certification and Accreditation Program (DIACAP) for the USAF system hosted at AWS.

He wrote software in Clojure, Common Lisp, Java, Python, R, Scheme, and sh, including a Clojure framework that provides the basis of a next-generation system for processing and analyzing binary flight data.

Select Publications

Boros, E., Kantor, P.B., Neu, D. (1999). Pheromonic Representations of User Quests by Digital Structures. In Woods, Larry (Ed.). Proceedings of the 62nd Annual Meeting of the American Society for Information Science, 36, 633-642.

Kantor, P.B., Boros, E., Melamed, B., Neu, D.J., Menkov, V., Shi, Q., Kim, M.H. (1999). Ant World (Demonstration Abstract). Proceedings of the 22nd Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, 323. SIGIR 1999.

Kantor, P.B., Boros, E., Melamed, B., Menkov, V., Shapira, B., & Neu, D.J. (2000). Enabling Technologies: Capturing Human Intelligence in the Net. Communications of the ACM, 43 (8), 112-115.

Boros, E., Kantor, P.B., & Neu, D.J. (2000). Logical Analysis of Data in the TREC-9 Filtering Track. In D. Harman and E. Voorhees (Eds.). The Ninth Text Retrieval Conference, SP 500-249. Washington, D.C.: U.S. Department of Commerce, NIST Special Publication.

Menkov, V., Neu, D.J., Shi, Q. (2000). AntWorld: A Collaborative Web Search Tool. In Proceedings of the Third International Workshop on Distributed Communities on the Web (DCW 2000). Quebec City, Canada.

Boros, E., Kantor, P.B., Neu, D.J. (2001). A Clustering Based Approach to Creating Multi-Document Summaries. In Proceedings of the 2001 Document Understanding Conference (DUC 2001). New Orleans, Louisiana USA.

Anghelescu, A., Boros, E., Lewis, D., Menkov, V., Neu, D., & Kantor, P.B. (2002). Rutgers Filtering Work at TREC 2002: Adaptive and Batch. In E. Voorhees and L.P. Buckland (Eds.). The Eleventh Text Retrieval Conference, SP 500-251. Washington, D.C.: U.S. Department of Commerce, NIST Special Publication.

Boros, E., Kantor, P.B., Neu, D.J. (2003). Combining First and Second Order Features in the TREC 2003 Robust Track. In E. Voorhees and L.P. Buckland (Eds.). The Twelfth Text Retrieval Conference, SP 500-255. Washington, D.C.: U.S. Department of Commerce, NIST Special Publication.

Neu, D.J., van der Veen, A.P., Harris, R.M., Winner, D. (2007). Web-based Analytic Tool for the FAA's Aviation Safety Action Program. In Proceedings of The Third Safety Across High-Consequence Industries (SAHI) Conference. St. Louis, MO: Saint Louis University.