
Bringing Artificial Intelligence to Life: The Technical Lifecycle

January 27, 2017

Featured article by Austin Ogilvie, CEO and co-founder of Yhat, Inc.

In an increasingly digital economy, businesses are racing to build operational knowledge around the vast amounts of data they produce each day. And with data now at the center of almost every business function, developing practices for working with data is critical regardless of your company’s size or industry.

“Data science,” one of many recently popularized terms in the current crop of buzzwords, is the field concerned with extracting knowledge from data. Practitioners, aptly named “data scientists,” are charged with solving complex problems related to data, usually employing a highly diversified blend of scientific and technical tools as well as deep business and domain expertise.

Data scientists may also create algorithms that are capable of learning or improving on their own through a self-contained feedback loop, without human intervention. This category of data science is called machine learning. Thanks to a massive decrease in computing costs over the past decade and an industry-wide movement toward open source technologies, machine learning has propelled the sci-fi fantasy of artificial intelligence products into reality.
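To make the idea of a self-contained feedback loop concrete, here is a minimal sketch (in Python, using scikit-learn) of an online learner that updates itself each time a new batch of labeled data arrives. The data stream and the hidden rule it learns are invented for illustration:

```python
# A minimal sketch of the "feedback loop" idea: a model that keeps
# improving as new labeled examples stream in, with no human in the loop.
import numpy as np
from sklearn.linear_model import SGDClassifier

rng = np.random.default_rng(0)
model = SGDClassifier(loss="log_loss")  # logistic regression, trained incrementally
classes = np.array([0, 1])

for batch in range(10):
    # Simulate a fresh batch of observations arriving from production.
    X = rng.normal(size=(100, 4))
    y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(int)  # hidden "true" rule to learn
    if batch > 0:
        # Accuracy on data the model has not seen yet, before learning from it.
        print(f"batch {batch}: accuracy {model.score(X, y):.2f}")
    model.partial_fit(X, y, classes=classes)  # update the model in place
```

Each pass through the loop scores the model on unseen data and then folds that data into the model, so the printed accuracy tends upward as the feedback loop runs.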

Take Apple’s Siri as an example. Siri uses a branch of machine learning called deep learning to recognize your speech patterns and learn about your behavior, such that you can interact with “her” as naturally (or almost as naturally) as you would with a person. This type of human-computer interaction isn’t possible with strict rules-based business logic alone. Many tech behemoths are committed to open-sourcing the AI research that powers some of the world’s most well-known and beloved consumer-facing apps (e.g., TensorFlow from Google or Torch from Facebook, to name two).

So how do data scientists take an idea, like an intelligent and personalized product recommender on an app or website, and move from concept to product? Broadly speaking, a project begins with some question, goal, or business problem in mind, with varying degrees of focus. In the case of a smart recommendation system, it might all begin with the observation that the majority of users visit only two pages of search results before leaving for another site to search for a product.

With a narrow and precise definition of the problem, data scientists can begin to evaluate different data sets to identify which variables are likely to be relevant to the problem they are trying to solve. Quantitative analysts usually work in proximity to, or in direct collaboration with, engineers, marketers, operations teams, product managers, and other stakeholders to gain a robust and intimate understanding of the data sources at their disposal. In the case of our fictitious retailer, a user’s age, shopping history, device, and location might all be relevant to a personalized recommendation.
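As a rough sketch of what that preliminary survey might look like in practice, a data scientist could slice an outcome such as conversion across the candidate variables. The file name and column names below are hypothetical, invented for this retailer example:

```python
# A quick exploratory pass over hypothetical retailer session data to see
# which variables relate to conversion. All names here are illustrative.
import pandas as pd

df = pd.read_csv("sessions.csv")  # hypothetical export of user sessions

# How does the conversion rate vary across candidate categorical features?
for col in ["device", "age_bucket", "region"]:
    print(df.groupby(col)["converted"].mean().sort_values(ascending=False))

# Correlation of numeric features with the outcome (converted is 0/1).
numeric = df.select_dtypes("number")
print(numeric.corr()["converted"].sort_values(ascending=False))
```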

After firming up the project’s definition and completing a preliminary survey of the data, analysts enter the model-building phase of the analytics lifecycle. Identifying the right algorithms and machine learning methods for your problem is largely an exploratory exercise. This phase is characterized by rigorous testing of different algorithms and methods drawn from one or more problem classes (i.e., clustering, regression, classification, and ranking), with the ultimate goal of identifying the “best” way to model some underlying business phenomenon. Practically speaking, data scientists generally use programming languages like R and Python, which are ideal for cleaning, exploring, and modeling data, and which also allow them to leverage cutting-edge open source libraries and packages like TensorFlow, DSSTNE, Keras, and Spark.
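To illustrate how exploratory this phase is, the sketch below compares a few candidate classification algorithms on the same data with cross-validation in scikit-learn. The data set is synthetic and the candidate list is arbitrary; a real project would draw candidates from whichever problem class fits the business question:

```python
# Exploratory model-building: score several candidate algorithms on the
# same data and compare before committing to one.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=42)

candidates = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=42),
    "gradient boosting": GradientBoostingClassifier(random_state=42),
}

for name, model in candidates.items():
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f} (+/- {scores.std():.3f})")
```

Cross-validation gives each candidate a comparable score on held-out data, which is what lets the team call one approach “best” with a straight face.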

Alternatively, there are solutions offering off-the-shelf or drag-and-drop machine learning capabilities that aim to empower non-technical or semi-technical “business users” to play the role of data scientist without writing code. The aim of these products is to remove the quantitative complexity of data science so that users need little or no expertise in machine learning or AI concepts. On the surface, this approach is compelling in that it eliminates the need for data scientists altogether. These products, however, are limited, inflexible, and often cannot support the situational nuance of real-world data problems.

For example, consider an insurance company that wants to use AI to estimate car damage from photos taken by policyholders and submitted with claims on their mobile devices. This is a compelling user experience for consumers and would greatly expedite the turnaround time on claims. Data science teams can tap into a wealth of open source AI tools when building a model to detect paint and body damage in photos, but few, if any, drag-and-drop machine learning tools provide the flexibility required. Additionally, in many sectors like finance and healthcare, it is critical, and often required by law, that the logic behind a decision can be traced and explained.
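To give a sense of what the open source route affords here, the sketch below shows one plausible starting point: transfer learning on a pretrained image network in Keras, one of the libraries named above. The folder layout and the two-class framing (“damage” vs. “no_damage”) are hypothetical, and a production claims model would be considerably more involved:

```python
# A sketch of a photo-damage classifier via transfer learning in Keras.
# Dataset paths and the two-class setup are invented for illustration.
from tensorflow import keras

# Reuse an ImageNet-pretrained network as a frozen feature extractor.
base = keras.applications.MobileNetV2(
    input_shape=(224, 224, 3), include_top=False, weights="imagenet")
base.trainable = False

model = keras.Sequential([
    keras.layers.Rescaling(1.0 / 127.5, offset=-1),  # scale pixels to [-1, 1]
    base,
    keras.layers.GlobalAveragePooling2D(),
    keras.layers.Dense(1, activation="sigmoid"),     # probability of damage
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])

# Hypothetical folder layout: claims_photos/train/{damage,no_damage}/*.jpg
train = keras.utils.image_dataset_from_directory(
    "claims_photos/train", image_size=(224, 224), label_mode="binary")
model.fit(train, epochs=5)
```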

At this point in the artificial intelligence lifecycle, our data scientists have built, tested, and refined models in R or Python. But how does such a model become part of an app or product? Generally, application developers gravitate toward frameworks like .NET, Ruby on Rails, Node.js, or the JVM ecosystem, which are not inherently compatible with data scientists’ tools. Application developers may choose to port data scientists’ code into their own framework, though this translation process can take months of engineering and IT time and is highly error-prone.

Unfortunately, due to the complexity of putting algorithms into production, over 90% of models written today will never move beyond data scientists’ laptops. To overcome the implementation hurdle and start realizing the business value of their data science teams’ efforts, many companies are beginning to opt for model deployment software, which exposes algorithms through REST APIs and allows data scientists to “productionize” their code without porting it to another language. I’m confident that as more companies equip their data science teams with these new technologies, their efforts to bring AI to “life” and to market will proliferate, impacting an increasing number of the products we all interact with daily.
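One common pattern behind such deployment software is to wrap a trained model in a small HTTP service so that applications written in any language can call it over REST. The Flask sketch below illustrates the pattern only, not any particular vendor’s product; the serialized model file and the request format are assumptions:

```python
# Expose a trained model over HTTP so any app framework can call it.
import pickle

from flask import Flask, jsonify, request

app = Flask(__name__)
with open("model.pkl", "rb") as f:  # hypothetical serialized scikit-learn model
    model = pickle.load(f)

@app.route("/predict", methods=["POST"])
def predict():
    features = request.get_json()["features"]      # e.g. {"features": [[...]]}
    prediction = model.predict(features).tolist()  # run the model, JSON-safe output
    return jsonify({"prediction": prediction})

if __name__ == "__main__":
    app.run(port=5000)
```

With the model behind an endpoint like this, a .NET or Node.js application simply makes an HTTP request instead of re-implementing the data scientists’ R or Python code.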

Austin Ogilvie is the CEO and co-founder of Yhat, Inc., a machine learning software company based in Brooklyn. He was previously at OnDeck Capital, the largest online small business lender in the US.