Data Science News Flash: 09-12-2019

The latest Data Science articles - algorithmically curated, ranked, and summarized just for you.

News Flash is a weekly publication that features the top news stories for a specific topic. The stories are algorithmically curated, evaluated for quality, and ranked so that you can stay on top of the most important developments. Additionally, the most important sentences for each story are extracted and displayed as highlights so you can get a sense of what each story is about. If you want more information for a particular story, just click on it to read the entire article.

You can also see the other topics we have News Flashes for and sign up to receive any that interest you.

The evolution of machine learning to command dark data | ITProPortal


  • Machine learning is an application of artificial intelligence (AI) that provides systems with the ability to automatically learn and accomplish the equivalent of continuously running programmes in a fraction of the time.
  • In the case of dark data, the process of learning starts with data observations to look for patterns that will help make better decisions in the future based on previous examples.
  • In today’s market where data is competitive currency, dark data is critical as it allows businesses to learn more about elements of their operations.
  • By adopting new technologies around machine learning, such as deep learning, businesses can combine structured and unstructured data to generate high-value results.
  • By allowing machine learning to unleash dark data, businesses can reveal new insights and knowledge that will yield a greater competitive advantage and boost the bottom line.

Insilico Medicine’s GENTRL: Artificial Intelligence Continues to be Priority for Biopharma | BioSpace


  • In order to speed up drug development and accelerate target identification, biopharma is increasingly turning to artificial intelligence (AI) and machine learning.
  • The two companies will focus on using AI and machine learning to discover and develop new drugs for chronic kidney disease (CKD) and idiopathic pulmonary fibrosis (IPF).
  • The commonality was that Concerto focuses on oncology-specific Real-World Data (RWD) and advanced AI for Real-World Evidence (RWE) generation, an area of increasing interest for biopharma companies.
  • Concerto HealthAI will collaborate with Pfizer on Precision Oncology using Concerto’s eurekaHealth platform, artificial intelligence (AI) models and Real World Clinical Electronic Medical Record (EMR) and healthcare claims.
  • The team in Microsoft’s Station B initiative will use AI and machine learning to increase the yield and improve the purity of Oxford Biomedica’s lentiviral vectors while cutting costs.

9 Skills A Data Scientist Must Have To Land A Job: AIM Skills Study 2019


  • Analytics India Magazine conducted the Data Science Skills Study to understand the key trends driving the skills economy and how data scientists’ toolchains are evolving.
  • Python is a versatile language that data scientists use to carry out data science and machine learning projects.
  • For deep learning, a data scientist can use TensorFlow, Keras, Theano, and PyTorch to solve complex and more advanced problems in data science and deep learning.
  • Besides logistic regression, other algorithms such as decision trees, convolutional neural networks, and feedforward neural networks are also in demand for data science projects.
  • Our survey reveals that 43% of data scientists work on Amazon Web Service (AWS) while 33% and 16% of data scientists use Google Cloud and Microsoft Azure respectively.

Explorium secures $19M funding to automate data science and machine learning-driven insights | ZDNet


  • Machine learning is a powerful paradigm many organizations are utilizing to derive insights and add features to their applications, but using it requires skills, data, and effort.
  • Businesses are good at collecting data, and the Internet of Things is taking it to the next level.
  • Shlomo concluded by mentioning that Explorium aggregates multiple data sources into a single coherent and meaningful piece of data using machine learning methods, as well as structuring untapped data from online assets, such as photos, and extracting entities and actions from web text (e.g., articles).
  • It enables Explorium to understand which data sources the user can connect to, which sources the platform can automatically explore, and which features can be automatically generated later on.
  • Explorium is not an artificial bridge between the data and AI silos that could be mimicked through a transactive partnership.

Leading data company invests in Madison-based startup


  • Redwood City, California-based Informatica Corp. announced this week it is the sole investor in GreenBay Technologies Inc., a Madison-based company founded by University of Wisconsin-Madison professor of computer science AnHai Doan and two Ph.D. students that uses artificial intelligence and machine learning techniques for data management.
  • Informatica said in a news release it will use GreenBay Technologies’ CloudMatcher technology to develop innovations that strengthen the impact of its AI-powered CLAIRE engine.
  • In explaining Informatica’s decision, Amit Walia, Informatica president of products and marketing, said GreenBay Technologies’ machine learning-based data management innovations are “practical, effective, and more advanced than anything we’ve seen”.
  • The fact a tech company of Informatica’s size was investing in GreenBay Technologies shows the startup has interesting technology, said Kathleen Gallagher, executive director at the Milwaukee Institute, a nonprofit group that promotes local technological innovation and entrepreneurship.
  • The announcement by Informatica closely follows news that UW-Madison has established a new School of Computer, Data & Information Sciences.

Examining Top Data Analytics Firms in the 2019 Forbes Cloud 100


  • The editors at Solutions Review have perused the 2019 Forbes Cloud 100 and identified these top data analytics firms as warranting extra attention.
  • Capabilities are delivered via a SaaS-based data analytics platform that enables Dev and Ops teams to work closely on the infrastructure to resolve performance issues and ensure that development and deployment cycles finish on time.
  • Databricks offers a unified analytics platform that allows users to prepare and clean data at scale and continuously train and deploy machine learning models for AI applications.
  • The data analytics product also allows users to combine data and uncover insights in a single interface without scripting, coding or assistance from IT.
  • Users can then apply machine learning and data science techniques to build and deploy predictive data flows.

Am I Ready to Build a Data Science Team?


  • To establish both, we recommend developing a list of every possible data science project: go through every team that could benefit from data science and, within each team, separate projects into internal- and external-facing. After creating a master list of projects, fill out the following for each project:
  • As in the Gantt chart above, imagine each data science project you’re doing has three essential phases: collection, storage, and analysis.
  • If two projects share the same data sources, you can optimize work streams by working with a data engineer to collect and store the data in one shot.
  • Even with state-of-the-art infrastructure and a talented data science team, low quality data will yield low quality output. Data projects are cross-functional, especially when working with systems that handle data.
  • If you have verified that the available data supports your projects, identified people who can provide technical support, and set aside budget for team and tools, you are in a strong position to start the data science team at your company.
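
For readers who want to try the "shared data sources" idea from the highlights above, here is a minimal sketch in Python. It assumes you already have a master list of projects and the data sources each one needs; the project names, source names, and the `shared_sources` helper are all hypothetical and do not come from the article.

```python
from collections import defaultdict


def shared_sources(projects):
    """Return the data sources needed by more than one project.

    `projects` maps a (hypothetical) project name to the set of data
    sources it needs. The result maps each shared source to the sorted
    list of projects that use it, so collection and storage work for
    that source can be planned once.
    """
    by_source = defaultdict(list)
    for name, sources in projects.items():
        for source in sources:
            by_source[source].append(name)
    # Keep only sources used by at least two projects.
    return {s: sorted(names) for s, names in by_source.items() if len(names) > 1}


# Illustrative master list; every name here is made up.
master_list = {
    "churn model": {"crm", "billing"},
    "sales dashboard": {"crm"},
    "fraud alerts": {"payments"},
}

print(shared_sources(master_list))
```

In this made-up example only the `crm` source is shared, so it is the one worth handing to a data engineer to collect and store once for both projects.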

Generative AI: A Key to Machine Intelligence?


  • For completeness of the picture, let’s have a look at how machine learning is defined in some rather classical ML books like “Pattern recognition and machine learning” by Christopher Bishop.
  • Maybe generative models now start to look to you like a nice, more complete extension of the standard statistical learning framework, one that is supposed to learn more general knowledge about the underlying data.
  • Why are most university courses, MOOCs and tutorials full of supervised learning, while unsupervised generative modeling appears only on the blogs of some Ph.D. students and in academic publications?
  • The real goal was to broaden the data science mindset a bit, to remind you about the fundamentals of statistical learning, and to show why the AI research community is partly obsessed with generative modeling, and that it’s not just for amusement.
  • Also, I recommend reading my article on alternative use cases of generative models, where I show use cases beyond supervised learning.

Trifacta, A Data Cleaning Startup, Raised $100 Million From Investors | Fortune


  • Trifacta, a startup that specializes in cleaning corporate data so it can be analyzed, has raised $100 million in funding, underscoring current investor appetite for data-crunching startups amid the artificial intelligence boom.
  • New Trifacta investors who were part of the funding round include Telstra Ventures, Energy Impact Partners, Japanese mobile operator NTT Docomo, BMW i Ventures, and Dutch bank ABN AMRO.
  • In the past, coders would have to write software rules that could help clean the data, but the task can be time consuming for corporate data scientists.
  • Data-cleaning tools will help scientists put an end to wasting time working as “glorified data janitors” and instead focus on analyzing the information, Trifacta CEO Adam Wilson said.
  • Wilson said the company plans to use some of the funding to expand into the Asia-Pacific region, which was one of the reasons Trifacta took investment from the Australian venture capital firm Telstra Ventures and NTT Docomo.

Trump, Tweets, and Trade


  • Before diving into the details of the trading logic, it’s important to cover some key details and assumptions I made prior to developing the trading strategy:
  • At the core of any sentiment trading strategy is the theory that animal spirits will drive markets higher when sentiment is high and lower when sentiment stumbles.
  • The daily trading strategy logic for the base case, positive sentiment signal, and negative sentiment signal is listed in detail below:
  • A quick eyeball test indicates the majority of buy signals were generated in market uptrends and sell signals were generated around minor selloffs, which is great news for the strategy.
  • Finding a profitable backtest is a worthwhile investigation, but the true success of any trading strategy is how it performs in the future on live data.
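
The highlights above mention a base case, a positive sentiment signal, and a negative sentiment signal, but the article's actual rules are not reproduced here. The Python sketch below is therefore only an illustration under stated assumptions: the thresholds, the long/flat/short positions, and the `signal` and `backtest` names are hypothetical, not the author's strategy. It does show one point that matters for any backtest of this kind: yesterday's signal is applied to today's return, so the test never trades on information it would not yet have had.

```python
def signal(sentiment, buy_above=0.3, sell_below=-0.3):
    """Map a daily sentiment score to a position.

    Assumed rules (not from the article): +1 (long) on a positive
    signal, -1 (short) on a negative signal, 0 (flat) in the base case.
    """
    if sentiment > buy_above:
        return 1
    if sentiment < sell_below:
        return -1
    return 0


def backtest(sentiments, returns):
    """Compound the strategy's equity, starting from 1.0.

    sentiments[i] is the score observed on day i and returns[i] is the
    market's return on day i. The position for day i comes from day
    i - 1's sentiment, which avoids look-ahead bias.
    """
    equity = 1.0
    for i in range(1, len(returns)):
        equity *= 1.0 + signal(sentiments[i - 1]) * returns[i]
    return equity


# Made-up data: three days of sentiment scores and market returns.
print(backtest([0.5, -0.5, 0.0], [0.0, 0.01, -0.02]))
```

As the last highlight notes, a profitable result on historical data like this says little about how the same rules would perform on live data.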

Produced and Sponsored by:

Innovative Data Science & Advanced Analytics Solutions
