WE ARE DISTRICT DATA LABS
District Data Labs is a data science research institute, data product incubator, and open source collaborative where people from diverse backgrounds come together to work on interesting projects, push themselves beyond their current capabilities, and help each other become more successful data scientists.
Some of the activities our members participate in include:
- Conducting research on advanced data science topics
- Developing useful data products and tools
- Speaking at conferences and meetups
- Publishing books, papers, and articles
- Teaching data science courses
- Consulting and advising at companies
- Contributing to open source projects
If you have a passion for working with data, some spare time on your hands, and a drive for making yourself and those around you better, please consider joining us!
MEET OUR TEAM
Tony Ojeda is an accomplished data scientist, author, and entrepreneur with expertise in streamlining business processes and over a decade of experience creating and implementing innovative data products. He believes that technological solutions should amplify or extend human abilities, and he is deeply passionate about advancing the field of data science and the abilities of those who practice it. Tony has a Master's in Finance from Florida International University and an MBA with concentrations in Strategy and Entrepreneurship from DePaul University. In addition to being the Founder and CEO of District Data Labs, he is also a Co-Founder of Data Community DC, a non-profit organization that promotes the work of data scientists through community-driven events.
Benjamin Bengfort is a Data Scientist who lives inside the beltway but ignores politics (the normal business of DC), favoring technology instead. He is currently working to finish his PhD at the University of Maryland, where he studies distributed computing (and machine learning). His focus is on consistency and availability in distributed data systems, particularly those in partition-prone networks outside of data centers. The lab next door does have robots and, much to his chagrin, they seem to constantly arm said robots with knives and tools, presumably to pursue culinary accolades. Having seen a robot attempt to slice a tomato, Benjamin prefers his own adventures in the kitchen, where he specializes in fusion French and Guyanese cuisine as well as BBQ of all types. A professional programmer by trade and a Data Scientist by vocation, Benjamin writes on a diverse range of subjects, from Natural Language Processing to Data Science with Python to analytics with Hadoop and Spark.
Dr. Rebecca Bilbro is Lead Data Scientist at Bytecubed, where she and her team use machine learning and Python to build custom data solutions. With District Data Labs, she has conducted research on semantic network extraction and entity resolution, as well as high dimensional information visualization, which led to the development of the Yellowbrick Project. Rebecca is also an organizer for Data Innovation DC and sits on the Board of Directors for Data Community DC. She earned her doctorate from the University of Illinois, Urbana-Champaign, where her research centered on communication and visualization practices in engineering.
Laura is a data and software engineer at Industry Dive, a B2B media company, where she implements and operates full-stack solutions with both the web and data teams using primarily Python tools and frameworks such as Django, Flask, pandas, and scikit-learn. She also contributes to open source projects and conducts research with a focus on neural network implementations for NLP as a faculty member of DC-based research and education organization District Data Labs. She is an advocate of technology literacy, teaching workshops, webinars and classes in the DC area with Georgetown University SCS, District Data Labs, and NYCDA.
Will Sankey is a Data Scientist with Xometry, a rapidly growing 3D printing and CNC machining services firm in Maryland. Prior to this position, Mr. Sankey worked on healthcare evaluation projects at L&M Policy Research, including work to help reduce opioid overutilization. He received a master's degree in public policy from the Johns Hopkins University and is interested in helping individuals new to the field of data science survive and thrive.
Daniel Chudnov is a librarian, software developer, and data scientist with two decades of experience implementing software solutions to complex data and information access, integration, and preservation problems. He works as a consultant based in Washington, DC. Previously, he served as Director of Scholarly Technology and principal investigator on federal grants at GWU Libraries, built large-scale digital collections at the Library of Congress, and worked as a programmer and librarian at Yale University School of Medicine and MIT Libraries. Daniel earned a Bachelor's in Economics and a Master's in Information Science at the University of Michigan, and more recently a Master's from the Business Analytics program at GW School of Business, where he teaches Data Management. He will always jump at a chance to work on innovative projects that combine diverse data sources, computing power, and coherent user experience to empower people to make better decisions and share knowledge.
Sasan is a developer with over ten years of experience creating innovative solutions around challenging datasets. Currently, he is working as a Data Engineer at Commerce Data Service, a public startup, to create new data products for various agencies in need of technical expertise and guidance. Previously, he served as a Technical Architect and team lead for the Enforcement Data project, supporting Open Government initiatives within the Department of Labor. Sasan got his start at James Madison University in Computer Information Systems, and has since been working in such areas as data storage, data cleansing, data processing, and entity resolution.
Nicole Donnelly is a former computer forensics and electronic discovery consultant, now working as a data management IT specialist with the Office of the Chief Technology Officer, District of Columbia. She believes a city that consumes and understands its own data is acting in the true spirit of public service by improving the lives of its residents, and she hopes to explore this further as the open data movement grows. She has a professional certificate in data science from Georgetown University and has continued to work with the program, first as a teaching assistant and now as an instructor. Nicole has completed the Data Science Immersive program at General Assembly and has Bachelor’s degrees from Rutgers University in Computer Science and Art History.
Will Voorhees is a software development engineer who specializes in writing enterprise security tools. For the past five years he's been working on products for protecting communication between services in a service-oriented architecture. These products run on hundreds of thousands of servers all over the world. In a previous life, he was a network administrator and web developer. Will has a Master's in Computer Science from North Dakota State University and a Bachelor of Science (also in Computer Science) from the University of Minnesota.
Kyle Rossetti is a budding analyst in the field of data science. He is currently a researcher at District Data Labs, focusing on the topic of entity resolution. His research areas of interest include Big Data, Internet of Things (IoT), and Machine Learning. He is also a Manager of Business Analysis at SENTEL Corporation, where he utilizes corporate and open source data sets to improve business processes. Kyle has his BS in Psychology from Radford University, and he graduated from the Georgetown SCS Data Science certificate program in 2015. His capstone data product was a flight recommender application.
Tommy is a statistician, mathematician, or data scientist, depending on the problem or audience. He holds an MS in mathematics and statistics from Georgetown University and a BA in economics from the College of William and Mary. He is the Director of Data Science at Impact Research, LLC. Tommy has previously performed economic and statistical modeling and analysis at the Science and Technology Policy Institute, the Federal Reserve Board, and the Institute for the Theory and Practice of International Relations. He has expertise in regression analyses, time series modeling and forecasting, natural language processing, data mining, and other quantitative techniques.
Abhijit is a data consultant working in the greater DC-Maryland-Virginia area, with several years of experience in biomedical consulting, business analytics, bioinformatics, and bioengineering consulting. He has a PhD in Biostatistics from the University of Washington and over 40 collaborative peer-reviewed manuscripts, with strong interests in bridging the statistics/machine learning divide. He is always on the lookout for interesting and challenging projects, and is an enthusiastic speaker and discussant on new and better ways to look at and analyze data. He is a member of Data Community DC and a founding member and co-organizer of Statistical Programming DC (formerly R Users DC).
Keegan Hines is a Data Scientist with IronNet Cybersecurity, focusing on large-scale machine learning applications in cyber defense. He received a PhD from the University of Texas with a focus on computational statistics and neuroscience, during which time he taught multiple seminars on statistical methods and R. He is interested in challenging problems in machine learning and distributed computing.
Dr. Evann Smith is the Senior Data Scientist for Thresher, which uses machine learning to help experts solve hard problems with unstructured text, where they specialize in algorithm development for natural language processing and machine learning over big text data. They also conduct ongoing academic research with foci in latent structure modeling and complex networks. They earned their doctorate from Harvard University, where their research focused on the development of quantitative methodologies for social science, with a substantive application to the use of communication technologies in mass mobilization in the Middle East. In their spare time, they are developing a machine learning-based application to help employers combat unconscious bias in job postings.
Fill out the contact form below and we'll get back to you as soon as possible.