Top Video Artificial Intelligence and Machine Learning at NAB 2018

Video artificial intelligence was a massive theme at NAB 2018, with a majority of the video publishing technology companies showing off some form of AI integration. As covered in my previous post, How Artificial Intelligence Gives Back Time, time is money in the video publishing business, and AI is set to be a very important tool. That's why all the big guns like AWS (Elemental), Google (Video Intelligence), IBM (Watson) and Microsoft (Azure) had digital AI eye candy to share. There was also a me-too feeling: all of them were competing to weave their video annotation/labeling and speech-to-text APIs into a variety of video workflows.

Top Video AI use cases:

  1. Labeling – The ability to label the elements within a video for specific scene selection: people, places, and things (see the sketch after this list).
  2. Editing – Segmenting by relevance, slicing the video into logical parts, and producing new cuts.
  3. Discovery – Using both annotation and speech-to-text to expand metadata for finding specific scenes within video libraries.
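As a concrete illustration of the labeling and discovery use cases, here is a minimal sketch using Google's Video Intelligence API, one of the services shown at NAB. It assumes the google-cloud-videointelligence Python client and a video already sitting in a Cloud Storage bucket; the bucket path is a placeholder.

```python
from google.cloud import videointelligence

# Hypothetical input: a video already uploaded to Cloud Storage.
INPUT_URI = "gs://my-bucket/segment.mp4"

client = videointelligence.VideoIntelligenceServiceClient()

# Ask the API to detect labels (people, places, things) across video segments.
operation = client.annotate_video(
    request={
        "input_uri": INPUT_URI,
        "features": [videointelligence.Feature.LABEL_DETECTION],
    }
)
result = operation.result(timeout=300)

# Print each detected label with its best confidence score. These labels are
# the raw material for the metadata-expansion / discovery use case above.
for annotation in result.annotation_results[0].segment_label_annotations:
    confidence = max(s.confidence for s in annotation.segments)
    print(f"{annotation.entity.description}: {confidence:.2f}")
```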

Challenges

One of several challenges is the all-or-nothing situation. A video publisher's assets can be spread across many hard drives or encoded without much metadata. There are companies, like Axel, that provide services to index those videos and make them searchable with a mixed model of on-prem technology and cloud services. Dealing with live feeds requires hardware and bigger commitments. Most publishers are not willing to forklift their video encoding and library over to another provider without a clear ROI justification. The other big ROI challenge is that video publishers don't have a lot of patience, and the pressure to increase profits on video is higher now with more competition in the digital space across all channels. Selling AI as workflow efficiency won't be a big enough draw compared to AI generating substantial revenue by solving a specific problem; the pain isn't high enough yet to justify a big AI investment. There are lots of POCs in the market right now, however, no single product creates a seamless flow within a video publisher's existing workflow. Avid and Adobe are well positioned for the edit team since their products are so widely used. Other cloud providers are enabling AI technology, not delivering a specific solution.

AI Opportunity

Search and discovery was the biggest theme, using AI to do image analysis and speech-to-text. Compliance requirements for closed captions to make digital video accessible will be mandated, driving faster adoption. Editing video via AI is in its early phase, however, the technology is emerging fast. There are some exciting examples of AI-created video, but doing it at scale is another matter. Of the many talks at NAB, some exciting directions for AI in video were discussed around video asset management. Here are a few examples demoed at NAB 2018 that show promise in the video intelligence field.

Adobe Sensei

Adobe had a big splash with their new editing technologies, using AI to enhance the video editing process. Todd Burke presented Adobe Sensei, their AI entry into video intelligence. The video labeling demo and scene slicing were designed to help editors create videos faster and simplify the process. The segmenting was just a prototype, and the video labeling demonstrated the API extension integrated within Sensei.

IBM Watson

IBM's demo was slick and pointed to the direction of using machine learning to process large amounts of video and pull out the interesting parts. Adding announcer and crowd-response analysis provided another layer of segmentation. You can see a live demo of their AI highlights for the Masters. They did the same for Wimbledon, slicing up the live feed they were powering for the event and creating "cognitive highlights". It wasn't clear if these highlights were used by the edit team or if this was a POC. Regardless, both image and text analysis of the streams was occurring, and it demonstrated the power of AI in video.

Avid

The Avid demo was just that, a demo. They created a discovery UI on top of APIs like Google Vision to assist in video analysis for search and to support edit teams. Speech-to-text and annotation in one UI has its advantages. It wasn't clear how soon this was going to be made available beyond a development tool.

Google Vision

The team over at Zora had by far the slickest video hub approach. I believe the play for Google is more around their cloud strategy: trying to attract video storage and then leverage Video Intelligence to enable search over all your video assets. Google's Video Intelligence is just getting started, and their open-sourcing of the AI foundation TensorFlow makes them one of the top companies committed to video AI. I like what Zora is doing and can see editing teams benefiting from this approach. There was a collaborative element too.

Microsoft Azure

GrayMeta's UI was slick and their voice-to-text interface was amazing, all powered by Azure. Azure Video Indexer is the real deal, and its ability to identify faces has broad use cases. Indexing voice isn't new, but having a fast, slick UI helps enable adoption of the technology. They can pinpoint parts of the video from the text alone. There is a team collaboration element around the product that has a Slack feel. The approach was making all media assets searchable.
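Pinpointing moments in a video from the text alone boils down to a keyword search over time-stamped speech-to-text segments. Below is a minimal, self-contained sketch of that idea; the segment data and the search term are made up for illustration, and this is not the Video Indexer API itself.

```python
from typing import List, Tuple

# Hypothetical speech-to-text output: (start_seconds, end_seconds, text).
TRANSCRIPT: List[Tuple[float, float, str]] = [
    (0.0, 4.2, "welcome back to the broadcast"),
    (4.2, 9.8, "the quarterback throws a long touchdown pass"),
    (9.8, 14.5, "we will be right back after the break"),
]

def find_moments(segments, query: str):
    """Return the time ranges whose transcript text contains the query."""
    q = query.lower()
    return [(start, end) for start, end, text in segments if q in text.lower()]

# Jump straight to every moment where "touchdown" is spoken.
for start, end in find_moments(TRANSCRIPT, "touchdown"):
    print(f"Match from {start:.1f}s to {end:.1f}s")
```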

AWS Elemental

There were several cool examples of possibilities with Amazon Rekognition: video analysis, facial recognition and video segments. Elemental (purchased by Amazon) has core technology for video ad stitching, whereby video ads are inserted directly into the video. They created a UI extension demonstrating some possibilities with Rekognition, though it wasn't clear what was in production beyond the demo. The facial recognition around celebrities looked solid. Elemental also had a cool real-time object detection demo, with bounding boxes showing up on sports content. This has many use cases, however, it creates even more data for video publishers to manage, and how much data they can realistically handle needs to be addressed before adding another data firehose.
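For readers who want to poke at Rekognition Video themselves, here is a minimal sketch of asynchronous label detection using boto3; the bucket, object key, and polling loop are illustrative assumptions, not part of the NAB demo.

```python
import time
import boto3

rekognition = boto3.client("rekognition")

# Hypothetical video already sitting in S3.
job = rekognition.start_label_detection(
    Video={"S3Object": {"Bucket": "my-bucket", "Name": "game-highlights.mp4"}},
    MinConfidence=80,
)

# Poll until the asynchronous job finishes (a production system would use SNS).
while True:
    response = rekognition.get_label_detection(JobId=job["JobId"], SortBy="TIMESTAMP")
    if response["JobStatus"] != "IN_PROGRESS":
        break
    time.sleep(5)

# Each entry carries a timestamp (ms) plus the detected label and confidence.
for item in response["Labels"][:20]:
    label = item["Label"]
    print(f'{item["Timestamp"]} ms  {label["Name"]}  {label["Confidence"]:.1f}')
```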

Conclusion

Video artificial intelligence is just getting started and will only improve with greater computing advancements and new algorithms. The guts of what's needed to achieve scale already exist. The major use cases around video discovery and search are set to improve dramatically as industry players open up more APIs. Video machine learning has great momentum, utilizing these APIs to crack open the treasure trove of data locked away inside video. The combination of video AI and text analysis creates a massive amount of metadata for the multitude of use cases where voice computing can play a role. Outside of all the AI eye candy, there needs to be more focus on clear business problems vs. me-too features. In other words: what is the end product, and how will it make the video publisher more revenue?

How Artificial Intelligence Gives Back Time

How do AI (artificial intelligence) and machine learning give back time? It's not a secret that AI is here and coming much faster than many other technology booms; some are saying we're in the third wave of computing. In our previous post, 3 Ways Video Recommendation Drives Video Lifetime Value, we talk about how machine learning is transforming finding and recommending videos to enhance the consumer experience. For me, I'm just excited to be part of the machine learning business and to be creating powerful products focused on improving the digital video experience. I was recently commissioned to put together a short video and deck on how artificial intelligence is transforming the device-based experience for consumers. As part of this project, the brands were looking to understand the different ways AI is going to potentially impact their business. Here were the main topic areas.

  • How do you stay ahead of where AI is headed?
  • How should AI be leveraged to enhance brand trust, improve engagement and help consumers get jobs to be done in a way that is valued by consumers?
  • How can AI be employed to create better personal performance for individuals?

The presentation was to top brands like Scripps Network, Hertz, Bacardi, Planet Fitness, Arizona State University and DX Marketing.

Video Transcript ….

Hi, I'm Chase McMichael, CEO and co-founder of InfiniGraph. InfiniGraph focuses on increasing video lifetime value for video publishers and broadcasters, and we do that by processing vast amounts of their video data and understanding what visuals in the video engage consumers. By measuring which images or video clips are most engaging within certain scenes, we're able to increase video consumption by showing the right images or clips to excite the consumer. Here's a great example of a video clip that we extracted out of a video for CBS. By putting this specific clip in front of the right person at the right time, we're able to dramatically increase video take rates and produce a better consumer experience.

I was asked to talk about machine learning and artificial intelligence and how they will affect brands and improve the consumer experience around devices. One of the most exciting things about artificial intelligence, from my point of view, is that AI will give time back to individuals. AI is about making smart decisions for them or providing insights proactively.

So how should AI be leveraged to increase brand trust, engagement and help consumers? First off, brand trust. Brand trust is about anticipating your consumers and being very proactive when they interface with you. It's important to actually recognize them and provide them with incredible value and service. This is something that all brands have struggled with: someone comes into a retail establishment or comes online, and lacking responsiveness is a lost opportunity. A big opportunity is personalization. The ability to personalize one's experience is a big deal right now. Companies are really utilizing their data in some smart ways, especially in the retail segment; we're seeing this with Amazon and with movie companies like Netflix trying to customize the experience for their audience.

The other thing that we're seeing around brand trust is the ability to not only be intuitive and responsive but proactive. Being proactive requires a much higher level of intelligence around your data. Taking that data to the next level of insights, where you're really thinking ahead, is AI. The key is anticipation: what is that consumer going to buy, or how are they going to respond? Being more proactive when they purchase a product creates an incredible experience. Again, brand trust is easy to lose. Brands spend decades, even a hundred years, creating trust, and all of a sudden something happens, the internet revolts, and they become completely eviscerated.

It's critical that brands are responsive to what's happening across the social web and monitor it intelligently. How you interface with your consumers across mobile, social, or cloud applications all requires intelligence.

Don Peppers, back in the late 90s, was doing what's called one-to-one marketing, and this was really the onset of the personalized experience. If you're going to enhance consumer engagement, you really have to put in front of that consumer something that visually and cognitively gets them excited. Are you creating an emotional response? Without emotion, people do not recall information, so if you don't make an emotional connection with someone, their ability to engage, especially in this distracted economy, approaches zero.

Getting consumers to recall is very difficult. Good service is expected; the reality is people want to be wowed, and that really comes down to how responsive and intuitive your consumer touch points are. Do you know about the individual interacting with you on a day-to-day, month-to-month, or year-to-year basis? A core component of any company is understanding your consumers and their individual behavior. When a customer interacts with your brand, do you have the ability to recommend or provide insights that help and, again, give them back time? If not, you've really done them a disservice. Your interface has to be fluid or you create drudgery.

Improving engagement is the big win with artificial intelligence. Systems designed to predict and be intuitive, giving back time, will lead their industry. The core question any enterprise must think about is how it is going to re-engineer around consumer data, as well as how it captures data to drive an active feedback loop. This feedback mechanism between the consumers and that touch point will be the foundation of an AI system.

Helping consumers and creating a frictionless environment will win your consumer over, along with driving proactive actions that actually work. The other question you have to ask yourself is how you are utilizing that information to create a very robust profile, so that you're actually having a conversation with your customer.

You actually know a lot about their history and a lot about what was successful. Start engaging with your consumer by understanding their product usage, and leverage that information to improve the experience. From an AI perspective, you now have a system out there mining data to surface functional clusters of information, visually, perhaps vocally, as well as across standard data sets.

Think big here, because we're now in such a connected community and society that with the push of a button I can share a picture with thousands of people, and then a whole conversation spins up around your product, in both a positive and a negative sense. How you insert yourself in that conversation proactively is very important.

How you physically help a consumer really comes down to whether you can be intuitive about their needs. Think about personal data assistants or personal AI assistants. Personal assistants are going to be very intuitive, very smart, and crawl lots of different data sources. These AI assistants will proactively tell you to do things that simplify your life; an example is letting them go out and find information for you. These types of digital assistants are going to be extremely important in people's lives, because the things they used to spend their mental time on can give way to higher cognitive thinking instead of the mundane check-off tasks we do today.

If a brand is able to give time back to their consumers, creating an intuitive and frictionless experience will be the go-to experience. Now your brand will dominate with that customer, because you're creating not only loyalty but engagement through helping that consumer.

Another area that is creating lots of buzz is artificial intelligence taking over jobs. Clearly, as a big industry or brand you don't want to become the job killer in your industry, and that's a big issue for executives thinking about implementing intelligent automation. Everyone is reading everywhere that the robots are going to take over the world; the reality is, how are you augmenting your staff? What can you do to enable your staff to be more intelligent, utilizing internal resources to be more efficient and effective when they interface with consumers?

The real crux in this whole equation is how to enhance that experience with consumers while empowering employees to be smarter and faster, creating a symbiotic relationship.

Another thing around artificial intelligence is that you really need to be thinking not in quarters but in decades. The companies that have really focused on digital transformation and utilizing data intelligence to transform their business will be the disruptors. Are you going to be the leader? Those that have executed AI in their business will have the speed and ability to adapt, enabling them to trump anything that comes to the table. Think AI first.

A picture is worth a thousand words, so in my business videos are worth tens of thousands of words, and for us it's about finding that unique image or video clip, within a video segment or even a long-form video, that really gets consumers excited and engaged. This visual intelligence is critical in my business and very important for many brands. Using visual intelligence, especially in video, for marketing is an incredible opportunity. What you put in front of your consumers, and what you can learn from their engagement with the visual properties within the images themselves, is insight. The ability to adjust video is a competitive advantage, creating a higher order of thinking where you're giving a machine the ability to transform content in real time. That's the artificial intelligence we talked about previously, so think about how to take your brand and your industry and start thinking about all the data that's coming in.

It's all about data in, and quality and consumer engagement out.

Machine Learning, Video Deep Learning and Innovations in Big Data

Paul Burns, Chief Data Scientist at InfiniGraph, talks on Video Deep Learning and Machine Learning at Idea to IPO's Innovations in Big Data event.

Paul Burns Chief Data Scientist at InfiniGraph provides his point of view on what he has learned from doing massive video processing and video data analysis to find what images and clips work best with audiences. He spoke at the event Idea to IPO on Machine Learning, Video Deep Learning and Innovations in Big Data. Quick preview of Paul’s insights and approach to machine learning and big data.

I'm Paul Burns, Chief Data Scientist at InfiniGraph, working with a startup involved in mobile video intelligence. I've had a bit of a varied career, although a purely technical one. I started off in auto-sensing, spending 15 years doing research on RF sensor signal and data processing algorithms. I took a bit of a diverted turn in my career a number of years ago, got a PhD in bioinformatics, and worked in the life sciences, in the genomics and sequencing industry, for about three years. Now I have turned again, into video, so I have a range of experience working with large datasets and learning algorithms, and hopefully I can bring some insights that others here would like.

My own personal experience is one in which I've inhabited a space very close to the data source, so when I think about big data I think about opportunities to find and discover patterns that are not necessarily apparent to an expert, or that could be automatically found and used for prediction, analysis, or monitoring the health and status of sensors, at useful levels of effectiveness. There's a lot of difference in the perception of what big data really is, other than the common thread that it seems to be a way of thinking about data. And I hate the word data; it's so non-descriptive, so generic, that it has almost no meaning at all.

I think of data as just information that's stockpiled, and it could be useful if you knew how to go in and sort through the stockpile of information to find patterns: patterns that persist and can be used for predictive purposes. I think there had been generally slow progress over many decades, and the explosion in recent years is primarily because of the breakthroughs in computer vision and advancements in multi-layer deep neural networks, particularly for processing image and video data.

This is something that's taken place over the last ten years, first with the seminal paper authored by Geoffrey Hinton in 2006, which demonstrated breakthroughs in deep multi-layer neural networks, and then with the work published for the ImageNet competition in 2012, which made a significant advancement in performance over more conventional methods.

I think the major reason why there's all this excitement is because visual perception is so incredibly powerful. That's been an area where we've really struggled to make computers relate to the world and to understand and process things that are happening around them. There's a sense that we're on the cusp of a major revolution in autonomy; you can look at all the autonomous vehicles and all the human power and capital being put into those efforts.

Paul answers a question on privacy: Honestly, I think privacy has been dead for some time. The way it should be structured is the way Facebook works: I can choose to opt into Facebook and have the gory details of my life exposed to the world and to Facebook, but what I get out of that is being more closely connected to friends and family, so I choose to opt in because I want that reward. Privacy issues where I don't have the opt-out choice are the most problematic. There was a government program I'm aware of that happened in the Netherlands some years ago. They adopted a pilot program where people could opt out of having their hospital care data published in a government database, the purpose of which was to learn and find patterns in health outcomes. That's a little controversial, because the public health benefits of having such a database could be enormous and transformational, so it's a very complicated issue. I'm certainly probably not qualified to speak on this topic. I would say privacy has long since been dead, and we kind of have to do a postmortem.

We're very fortunate that so much very high-quality research has been published, and so many excellent data sets and model parameters are available for free download. When starting out, we were working on fairly generic replication of open systems; object recognition can be done with fairly high-quality, free, open-source code in a week. That was our starting point: to be able to advertise mobile video by selecting thumbnails that are somehow more enticing for people to click on than the default ones the content owners provide.

As it turned out, this is an idea our co-founders came up with (KRAKEN video machine learning: how to increase video lifetime value) a couple of years ago. It's amazing how bad humans are at predicting what other people want to click on. As far as we know, we are the only startup solely focused on this core idea, which sounds like a small business, but there is a lot of mobile video volume and advertising revenue out there, and it's growing.

What I do when I have a hard problem is try to stockpile as much data as I can to create the most thorough training set possible, and I think the most successful businesses will be the ones that are able to do that. It turns out there are actually companies whose whole business is helping you create training sets for your machine learning applications. We use a variety of methods to do that; crowdsourcing is one common way, and it's really expensive, far more expensive than I thought was even possible. Startups that find a way to harvest rich training sets that are valuable for inference have the potential to be huge winners. It just turns out to be very hard to do.

Another area that is big is wearable technology for personal health monitoring. I think that's an area with tremendous potential, just because your physician is starving for data. You have to make a point to see your doctor, schedule it, etc. So what do they do? They weigh you, take your blood pressure, and ask how old you are; that's about it. That's nothing, really; they do not know what's going on with you. Maybe it's personality dependent, but I would be very much in favor of disclosing all kinds of biometric information about myself if it's continuously recorded and stockpiled in a database, repeatedly scanned by intelligent agents for anomalies, and doctor's appointments automatically scheduled for me. Same thing with any complicated piece of machinery: it could be a car, it could be parts of your business. This kind of invasive monitoring will come with resistance, but it could be unleashed as people see the value in disclosing.

See full panel here Idea to IPO

3 Ways Video Recommendation Drives Video Lifetime Value

Video recommendation and discovery are very hot topics for video publishers looking to drive higher returns on their video lifetime value. Attracting a consumer to watch more videos isn't simple in the attention-deficit society we live in. However, major video publishers are creating better experiences, using video intelligence to delight, enhance discovery, and keep you coming back for more. In this post we'll explore the intelligence behind visual recommendation and how to enhance consumer video discovery.

Industry Challenge

Google Video Intelligence Demo At Google Next 17

Google Video Intelligence API demo of video search finding baseball clips within video segments.

Last year we posted on Search Engine Journal, How Deep Learning Powers Video SEO, describing the advantages of video image labeling and how publishers can leverage valuable data that was otherwise trapped in images. Since then, Google announced Video Intelligence at Next17. (InfiniGraph was honored to be selected as a Google Video Intelligence beta tester.) The MAJOR challenges with Google's cloud offering are pushing all your video over to Google Cloud, the cost of labeling video at volume, and losing control of your data. So how do you do all this on a budget?

Not all data is created equal

Trending Content - Lacks Image Based Video Machine Learning

Trending Content is based on popularity vs content context and the consumer content consumption.

And not all video recommendation platforms are created equal. The biggest video publishers are advancing their offerings with intelligence. InfiniGraph is addressing this gap by using video intelligence and creating affordable technology otherwise out of reach.

Outside of do-not-track constraints, creating a truly personalized experience is ideal. VOD / OTT apps offer the best path to robust personalization; for the web, a more generalized grouping of consumers is required.

See how “Netflix knows it has just 90 seconds to convince the user it has something for them to watch before they abandon the service and move on to something else”.

Video recommendation platforms

Video Recommendation Mantis Powered by KRAKEN Video Machine Learning

Image based video recommendation "MANTIS". Going beyond simple metadata and trending content to full intelligent context. Powered by KRAKEN.

All video recommendation platforms are reliant on the data entered (called metadata) when a video was uploaded to a video content management system: title, description, etc. The other main points of data capture are plays, time on video, and completion, indicating watchability. There is so much more to a video than these raw insights. Whether someone watched a video is important, but understanding the why, in the context of other videos with similar content, is intelligence. Many sites have trending videos, however, promoting videos that already get lots of plays creates a self-fulfilling prophecy, because trending is artificially amplified and doesn't indicate relevance.

An Intelligent Visual Approach

Going beyond meta data is key to a better consumer experience. Trending only goes so far. Visual recommendation looks at all the content based on consumer actions.

Surfacing the right video at the right time can make all the difference in whether people stay or go. Leaders like YouTube have already begun to leverage artificial intelligence in their video recommendations, producing 70% greater watch time. Recently they added animated video previews for their thumbnails, pushing take rates even higher. This is more proof that consumers desire intelligent recommendations and slicker visual presentation.

InfiniGraph provides a definitive differentiation, using actions on images and in-depth knowledge of what's in the video segments to build relevance. Consumers know what they like when they see it. Understanding this visual ignition process is key to unlocking the potential of visual recommendation. How do you really know what people would like to play if you don't know much about the video content? Understanding the video content and context is the next stage in intelligent video recommendation and personalized discovery.

3 Ways Visual Video Recommendation Drives Video Lifetime Value

1. Visual recommendation – Visual information within video creates higher visual affinity to amplify discovery. Content likeness beyond just metadata opens up more video content to select from. Mapping what people watch is based on past observation; predicting what people will watch requires understanding video context.

2. Video scoring – A much deeper approach to video had to be invented, where the video is scored based on the visual attributes inside the video and human behavior on those visuals. This scoring lets the content SPEAK FOR ITSELF and enables ordering a playlist relative to what was watched.

3. Personalized selection – Enhancing discovery requires greater intelligence and context around what content is being consumed. Depending on the video publisher's environment, an OTT or mobile app can enable high levels of personalization. For consumers on the web, a more general approach of clustering consumers into content preferences powers better results while honoring privacy.
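To make the three signals concrete, here is a minimal, hypothetical sketch of how visual affinity, a video score, and a per-cluster preference might be blended into one ranking; the field names and weights are illustrative assumptions, not InfiniGraph's actual scoring model.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    title: str
    visual_affinity: float     # similarity of the video's visuals to what the viewer engaged with (0-1)
    video_score: float         # behavior-weighted score of the visuals inside the video (0-1)
    cluster_preference: float  # how well the video matches the viewer's content-preference cluster (0-1)

def rank(candidates, w_visual=0.4, w_score=0.35, w_cluster=0.25):
    """Order candidate videos by a simple weighted blend of the three signals."""
    def blended(c: Candidate) -> float:
        return w_visual * c.visual_affinity + w_score * c.video_score + w_cluster * c.cluster_preference
    return sorted(candidates, key=blended, reverse=True)

playlist = rank([
    Candidate("Highlight reel", 0.9, 0.7, 0.5),
    Candidate("Interview", 0.4, 0.8, 0.9),
    Candidate("Recap", 0.6, 0.5, 0.6),
])
print([c.title for c in playlist])
```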

The Future is Amazing for Video Discovery

We have learned a great deal from innovative companies like Netflix, Hulu, YouTube and Amazon, who have all come a long way in their approach to advanced video discovery. Large-scale video publishers have a grand opportunity to embrace a new technology wave and stay relevant while creating a visually conducive consumer experience. A major challenge going forward is the speed of change video publishers must adapt to if they wish to stay competitive. With InfiniGraph's advanced technologies designed for video publishers, there is hope. Take advantage of this movement and increase your video lifetime value.

Top image from post melding real life with recommendations.

Video Publishers Ready for Video Autoplay Shutdown.

Video publishers have been caught off guard by the recent announcement of Apple blocking video autoplay. Even Google is pushing back on bad web ads. The backlash against video autoplay has been festering for some time. If losing video ad revenue and turning consumers off with declining traffic isn't a wake-up call, then what will be? Headlines like this from CNN, "Apple's plan to kill autoplay feature could leave publishers in the dust", should get video publishers' attention. This clampdown isn't a joke, and Google and Apple are taking a hard line to clean up the web experience when it comes to video. Here we dive deep into how to get ahead of these changes by Apple and Google and increase your video lifetime value.

Facebook started the conversation

Since Facebook started force-feeding video autoplay on us, other publishers followed suit knowing their video volume would go up. However, some major agencies flat out said they would only pay half of the CPMs due to the viewability issues with autoplay. A major advertiser (Heineken) is publicly having challenges getting a 6-second clip to stick. Publishers say the video relationship with Facebook is "complicated". This is a topic of constant discussion, and other players are outright opting out of video autoplay altogether in favor of a better consumer experience. The major catch-22 here is that publishers driving their O&O strategy can't think of autoplay as a video strategy—it's a tactic that, in most cases, turns consumers off. If you want to see some of the consumer backlash, just search Google for "how to turn off autoplay" and you will see that this is most definitely a real consumer pain point. With Apple's latest release of iOS 11 specifically blocking video autoplay, a more thoughtful and intelligent approach is required.

Video Strategy?

Publishers are responding to consumer demand by giving the options to turn OFF autoplay video.

A video strategy involves deciding to dominate a content category vertically and be the go-to source for the highest-value content in that space. Yes, video is content marketing. People watch video for information, enlightenment, entertainment, etc. Video is a very effective communication tool. Video is mobile and on demand. And being a tool, the publisher has a responsibility to harness and wield that tool surgically rather than as a blunt object that pushes video views without consumer consent or value added for paying advertisers. Some publishers understand this, such as LittleThings Inc. They are disabling video autoplay completely and focusing on consumer experience. This has resulted in higher play rates (CTR) and higher CPMs that can be verified and justified to their customers. The other major benefit was that consumers engaged more.

“We wanted video views to be on the consumer’s terms.  By running autoplay, you might [reach your desired] fill rate, but the user is not engaged with the brand the way they would if they raised their hands to watch the video” said Justin Festa, chief digital officer for Little Things, at JW Player’s JW Insights event in New York

Higher Intelligence

The digital publisher today is going to have to use higher intelligence with consumers. A surgical approach to utilizing data and then presenting it is now a must-have. So what is the benefit of artificial intelligence in video? It is better to start with the question: what is digital video? If we break it down, digital video is just a series of images and sequences spliced together. Humans are visual and have emotional responses to images and context. The story is a major draw, creating a greater emotional response than simply the affinity one may have to the people on screen. A computer that translates all of the above and puts it into context would have to be truly intelligent. This is not something new; Netflix proved you get higher take rates by showing the right images, which results in higher consumer engagement.

In the Making

Three years ago, a technology was introduced called KRAKEN. It utilizes video machine learning to select images, replacing the static, non-intelligent thumbnail with interactive dynamic thumbnails: the set of images that drives the highest play rates possible. The rotation of images provides more visual information when compared to a single image. Video clipping (GIF) came next, however, it is most effective for action shots. A new way of looking at video thumbnails was required. The solution was creating real-time, responsive, dynamic intelligence and scoring images based on relevance. Finding the best images is one thing; powering video recommendation was a natural fit on top of finding great images. Learning which collective visuals work together to extend time on site is a major deal for all publishers. We're living in exciting times, with advances in machine learning and computer chip design having achieved amazing levels of image processing capability. We have experienced a big leap forward in the code foundation (like deep learning) now powering platforms to segment out objects, images, places, and faces. We're in an artificial intelligence renaissance.
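The post doesn't spell out how KRAKEN scores images, but the general idea of learning in real time which thumbnail drives the highest play rate can be illustrated with a simple epsilon-greedy bandit; the candidate thumbnails and play probabilities below are made up for illustration and are not KRAKEN's actual algorithm.

```python
import random

# Hypothetical candidate thumbnails extracted from one video.
THUMBNAILS = ["frame_012.jpg", "frame_087.jpg", "frame_140.jpg", "frame_201.jpg"]

impressions = {t: 0 for t in THUMBNAILS}
plays = {t: 0 for t in THUMBNAILS}

def choose_thumbnail(epsilon: float = 0.1) -> str:
    """Mostly serve the best-performing image, sometimes explore another one."""
    if random.random() < epsilon or all(n == 0 for n in impressions.values()):
        return random.choice(THUMBNAILS)
    return max(THUMBNAILS, key=lambda t: plays[t] / max(impressions[t], 1))

def record(thumbnail: str, played: bool) -> None:
    """Update the observed play rate after each impression."""
    impressions[thumbnail] += 1
    if played:
        plays[thumbnail] += 1

# Simulated traffic: in reality `played` would come from player analytics.
TRUE_RATES = {"frame_012.jpg": 0.02, "frame_087.jpg": 0.06, "frame_140.jpg": 0.03, "frame_201.jpg": 0.04}
for _ in range(10_000):
    t = choose_thumbnail()
    record(t, random.random() < TRUE_RATES[t])

print({t: round(plays[t] / max(impressions[t], 1), 3) for t in THUMBNAILS})
```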

Show me the money

Video Recommendation Powered by KRAKEN Video Machine Learning

Video recommendation powered by KRAKEN video machine learning. Going beyond metadata and plays to the visuals within the video.

It's no secret that ads still drive the bulk of digital video revenue. For that very reason, each video play and each increase in time on site translates into cold hard cash. Making the site sticky and getting more repeat visits requires video intelligence. Google and Apple are very serious about protecting the mobile web. It is clear that Google AMP (Accelerated Mobile Pages) has won out with publishers, while Facebook Instant Articles has fallen short and most have abandoned it because it makes less money than AMP. The perfect trifecta of real-time video analytics, intelligent image selection, and video recommendation is now a reality. We have the data and processing power to predict what images make you excited and what video is most relevant to watch. Video discovery is key to increasing video lifetime value.

Conclusion

Are you ready for the do-not-track and non-autoplay world? Like it or not, Google and Apple are disabling video autoplay and intrusive ads. The digital broadcasting publisher has a grand opportunity to leverage machine learning in video. Tapping into visually relevant actions and drawing out behavior is a competitive advantage. Machine learning linked with digital video that maximizes your video assets is a strategic advantage and increases video lifetime value. The video recommendation example above was not possible before machine learning based video processing made it a reality. What possibilities can you imagine?

For OTT, Machine Learning Image is Worth More Than a Thousand Words

So, you've developed an OTT app and you've marketed it to your viewers.  Now your focus is on keeping your viewers watching.  How can machine learning drive more engagement? Let's face it—they may have a favorite show or two, but to keep them engaged for the long term, they need to be able to discover new shows.  Because OTT is watched on TVs, you have a lot of real estate to engage with your viewers.  A video's thumbnail has more of an impact on OTT than any other platform, so choose your thumbnails carefully!

Discovery is different on different platforms

On desktop, most videos start with either a search (e.g. Google) or a social share (e.g. Facebook).  Headlines and articles provide additional info to get a viewer to cognitively commit to watching a video.  Autoplay runs rampant, removing the decision to press "play" from the user.

TVs have a lot more real estate than smartphones

On a smartphone, small screen size is an issue.  InfiniGraph’s machine learning data shows that more than three objects in a thumbnail will cause a reduction in play rates.  Again, social plays a huge role in the discovery of new content, with some publishers reporting that almost half of their mobile traffic originates from Facebook.

OTT Discovery is Unique

The discovery process on OTT is unique because the OTT experience is unique.  Most viewers already have something in mind when they turn on their OTT device.  In fact, Hulu claims that they can predict with a 70% accuracy the top three shows each of their users is tuning in to see.  But what about the other 30%?  What about the discovery of new shows?

Netflix AB Test Example

Netflix has said that if a user can’t find something to watch in 30 seconds, they’ll leave the platform.  They decided to start A/B testing their thumbnails to see what impact it would have, and discovered that different audiences engage with different images.  They were able to increase view rates by 20-30% for some videos by using better images!  In the on-demand world of OTT, the right image is the difference between a satisfied viewer and a user who abandons your platform. If you’re interested in increasing engagement on your OTT app, reach out to us at InfiniGraph to learn more about our machine learning technology named KRAKEN that chooses the best images for the right audience, every single time.  Also, check out our post about increasing your video ad inventory!

More on machine learning powered image selection and driving more video views.

Making More Donuts

Being a publisher is a tough gig these days.   It’s become a complex world for even the most sophisticated companies.  And the curve balls keep coming.  Consider just a few of the challenges that face your average publisher today:

  • Ad blocking.
  • Viewability and measurement.
  • Decreasing display rates married with audience migration to mobile with even lower CPMs.
  • Maturing traffic growth on O&O sites.
  • Pressure to build an audience on social platforms including adding headcount to do so (Snapchat) without any certainty that it will be sufficiently monetizable.
  • The sad realization that native ads—last year's savior!—are inefficient to produce, difficult to scale and not easily renewable with advertising partners.

The list goes on…

The Challenge

Of course, the biggest opportunity—and challenge–for publishers is video.  Nothing shows more promise for publishers from both a user engagement and business perspective than (mobile) video. It’s a simple formula.  When users watch more video on a publisher’s site, they are, by definition, more engaged.  More video engagement drives better “time spent’ numbers and, of course,  higher CPMs.    

But the barrier to entry is high, particularly for legacy print publishers. They struggle to convert readers to viewers because creating a consistently high volume of quality video content is expensive and not necessarily a part of their core DNA.  Don't get me wrong.  They are certainly creating compelling video, but they have not yet been able to produce it at enough scale to satisfy their audiences.  At the other end of the spectrum, video-centric publishers like TV networks that live and breathe video run out of inventory on a continuous basis.

The combined result of publishers’ challenge of keeping up with the consumer demand for quality video is a collective dearth of quality video supply in the market.  To put it in culinary terms, premium publishers would sell more donuts if they could, but they just can’t bake enough to satisfy the demand.  

So how can you make more donuts?
Trust and empower the user! 

Rise of  Artificial Intelligence

The majority of the buzz at CES this year was about Artificial Intelligence and Machine Learning.  The potential for Amazon’s Alexa to enhance the home experience was the shining example of this.  In speaking with several seasoned media executives about the AI/machine learning phenomenon, however, I heard a common refrain:  “The stuff is cool, but I’m not seeing any real applications for my business yet.”  Everyone is pining to figure out a way to unlock user preferences through machine learning in practical ways that they can scale and monetize for their businesses.  It is truly the new Holy Grail.

The Solution

That's why we at InfiniGraph are so excited about our product KRAKEN.  KRAKEN has an immediate and profound impact on video publishing.  KRAKEN lets users curate the thumbnails publishers serve and optimizes toward user preference through machine learning in real time. The result? KRAKEN increases click-to-play rates by 30% on average, resulting in corresponding additional inventory and revenue.

It is a revolutionary application of machine learning that, in execution, makes a one-way, dictatorial publishing style an instant relic. With KRAKEN, users literally collaborate with the publisher on what images they find most engaging.  KRAKEN actually helps you, the publisher, become more responsive to your audience. It's a better experience and outcome for everyone.

The Future…Now!

In a world of cool gadgets and futuristic musings, KRAKEN works today in tangible and measurable ways to improve your engagement with your audience.  Most importantly, KRAKEN accomplishes this with your current video assets. No disruptive change to your publishing flow. No need to add resources to create more video. Just a machine learning tool that maximizes your video footprint.  

In essence, you don’t need to make more donuts.  You simply get to serve more of them to your audience.  And, KRAKEN does that for you!

 

For more information about InfiniGraph, you can contact me at tom.morrissy@infinigraph.com or read my last blog post  AdTech? Think “User Tech” For a Better Video Experience

 

How Deep Learning Video Sequence Drives Profits

Beyond the deep learning hype, digital video sequencing (clipping) powered by machine learning is driving higher profits. Video publishers use various images (thumbnails, or poster images) to attract readers to watch more video. These thumbnail images are critical, and their visual information has a great impact on video performance; the lead visual in many cases is more important than the headline. More views equal more revenue, it's that simple. Deep learning is having a significant impact on everything from video visual search to video optimization. Here we explore video sequencing and the power of deep learning.

Having great content is required, but if your audience isn’t watching the video then you’re losing money. Understanding what images resonate with your audience and produce higher watch rates is exactly what KRAKEN does. That’s right: show the right image, sequence or clip to your consumers and you’ll increase the number of videos played. This is proven and measurable behavior as outlined in our case studies. An image is really worth a thousand words.

Below are live examples of KRAKEN in action. Each form is powered by a machine learning selection process. Below we describe the use cases for apex image, image rotation and animation clip.

Animation Clip:

KRAKEN “clips” the video at the point of APEX. Sequences are put together creating a full animation of a scene(s). Boost rates are equal to those from image rotation and can be much higher depending on the content type.

  • PROS
    • Consumer created clipping points within video
    • Creates more visual information vs. a static image
    • Highlights action scenes
    • Great for mobile and OTT preview
  • CONS:
    • More than one on page can cause distraction
    • Overuse can turn off consumers
    • Too many on page can slow page loading performance (due to size)
    • Mobile LTE is slow and can lead to choppy images instead of a smooth video

Image Rotation:

Image rotation allows for a more complete visual story to be told when compared to a static image. This results in consumers having a better idea of the content in the video. KRAKEN determines the top four most engaging images and then cycles through them. We are seeing mobile video boost rates above 50%.

  • PROS:
    • Smooth visual transition
    • Consumer selected top images
    • Creates a visual story vs. one image to engage more consumers
    • Ideal for mobile and OTT
    • Less bandwidth intensive (Mobile LTE)
  • CONS:
    • Similar to animated clips, publishers should limit multiple placements on a single page

Apex Image:

KRAKEN always finds the best lead image for any placement. This apex image alone creates high levels of play rates, especially in a click-to-launch placement. Average boost rates are between 20% to 30%.

  • PROS:
    • Audience-chosen top image for each placement
    • Can be placed everywhere (including social media)
    • Ideal for desktop
    • Good with mobile and OTT
  • CONS:
    • Static thumbnails have limited visual information
    • Once the apex is found, the image will never be substituted

Below are live KRAKEN animation clip examples. All three animations start with the audience choosing the apex image.  Then, KRAKEN identifies clipping points (via deep learning) and uses machine learning to adjust to the optimal clipping sequence.

HitFix video deep learning: video clipping to action, with machine learning adjusting in real time.

Video players have transitioned to HTML5, and mobile consumption of video is the fastest growing medium. Broadcasters that embrace advanced technologies that adapt to consumer preference will achieve higher returns and at the same time create a better consumer experience. The value proposition is simple: if you boost your video performance by 30% (for a video publisher doing 30 million video plays per month), KRAKEN will drive an additional $2.2 million in revenue (see the KRAKEN revenue calculator). This happens with existing video inventory and without additional head count. KRAKEN creates a win-win scenario and will improve its performance as more insights are used to bring prediction and recommendation to consumers, thereby increasing video performance.
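The post doesn't show the arithmetic behind the $2.2 million figure, but a back-of-the-envelope sketch lands roughly there if you assume one ad impression per incremental play and a CPM of about $20; both numbers are assumptions on my part rather than values from the revenue calculator.

```python
# Rough annual revenue uplift from a 30% boost in play rate.
monthly_plays = 30_000_000   # plays per month before the boost
boost = 0.30                 # relative increase in plays
cpm = 20.0                   # assumed revenue per 1,000 ad impressions (USD)
ads_per_play = 1             # assumed ad impressions per incremental play

extra_plays_per_month = monthly_plays * boost
extra_revenue_per_month = extra_plays_per_month * ads_per_play / 1000 * cpm
extra_revenue_per_year = extra_revenue_per_month * 12

print(f"${extra_revenue_per_year:,.0f} per year")  # ~ $2,160,000, close to the quoted $2.2M
```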

How Deep Learning Powers Visual Search

The elusive video search whereby you can search video image context is now possible with advanced technologies like deep learning. It’s very exciting to see video SEO becoming a reality thanks to amazing algorithms and massive computing power. We truly can say a picture is worth 1,000 words!

Content creators have fantasized about doing video search. For many years, major engineering challenges were a roadblock to comprehending video images directly.

Originally posted on SEJ

Video visual search opens up a whole new field where video is the new HTML. And, the new visual SEO is what’s in the image. We’re in exciting times with new companies dedicated to video visual search. In a previous post, Video Machine Learning: A Content Marketing Revolution, we demonstrated image analysis within video to improve video performance. After one year, we’re now embarking on video visual search via deep learning.

Behind the Deep Curtain

Video clipping powered by KRAKEN video deep learning. Identify relevance within video images to drive higher plays

Many research groups have collaborated to push the field of deep learning forward. Using an advanced image labeling repository like ImageNet has elevated the field. The ability to take video, identify what's in the video frames, and apply descriptions opens up a huge set of visual keywords.

What is deep learning? It is probably the biggest buzzword around, along with AI (artificial intelligence). Deep learning came from advanced math for processing large data sets in a way loosely similar to how the human brain works. The human brain is made up of tons of neurons, and we have long attempted to mimic how those neurons work. Previously, only humans and a few other animals could do what machines can now do. This is a game changer.

The evolution of what's called a Convolutional Neural Network, or CNN, the workhorse of deep learning, was driven by thought leaders like Yann LeCun (Facebook), Geoffrey Hinton (Google), Andrew Ng (Baidu) and Li Fei-Fei (Director of the Stanford AI Lab and creator of ImageNet). Now the field has exploded and all major companies have open sourced their deep learning platforms for running convolutional neural networks in various forms. In an interview with the New York Times, Fei-Fei said, "I consider the pixel data in images and video to be the dark matter of the Internet. We are now starting to illuminate it." That was back in 2014. For more on the history of machine learning, see the post by Roger Parloff at Fortune.
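For readers who want to see what a convolutional neural network actually looks like in code, here is a minimal image classifier sketch in Keras; the layer sizes and the ten-class output are arbitrary illustration choices, not a model from the article.

```python
import tensorflow as tf

# A small convolutional neural network for classifying 224x224 RGB frames
# into 10 arbitrary categories (the class count is just for illustration).
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(224, 224, 3)),
    tf.keras.layers.Conv2D(32, 3, activation="relu"),   # learn local visual features
    tf.keras.layers.MaxPooling2D(),                     # downsample, keep strongest responses
    tf.keras.layers.Conv2D(64, 3, activation="relu"),
    tf.keras.layers.MaxPooling2D(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),    # one probability per label
])

model.compile(optimizer="adam", loss="sparse_categorical_crossentropy", metrics=["accuracy"])
model.summary()
```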

Big Numbers

KRAKEN video deep learning Images for high video engagement

Image reduction is key to video deep learning. Image analysis is achieved through big number crunching. Photo: image created by Chase McMichael.

Think about this: video is a collection of images linked together and played back at 30 frames per second. Analyzing that massive number of frames is a major challenge.

As humans, we see video all the time and our brains are processing those images in real-time. Getting a machine to do this very task at scale is not trivial. Machines processing images is an amazing feat and doing this task in real-time video is even harder. You must decipher shapes, symbols, objects, and meaning. For robotics and self-driving cars this is the holy grail.

To create a video image classification system required a slightly different approach. You must handle the enormous number of single frames in a video file first to understand what’s in the images.

Visual Search

On September 28th, 2016, a seven-member Google research team announced YouTube-8M, leveraging state-of-the-art deep learning models. YouTube-8M consists of 8 million YouTube videos, equivalent to 500K hours of video, all labeled against 4,800 Knowledge Graph entities. This is a big deal for the video deep learning space. YouTube-8M's scale required some pre-processing to pull frame-level features first; the team used the Inception-V3 image annotation model trained on ImageNet. What makes this such a great thing is that we now have access to a very large video labeling system, and Google did the massive heavy lifting to create 8M.

Google 8M Stats Video Visual Search

Top level numbers of YouTube 8M. Photo created by Chase McMichael.

The secret to handling all this big data was reducing the number of frames to be processed. The key is extracting frame-level features at one frame per second, creating a manageable data set. This resulted in 1.9 billion video frames, a reasonable amount of data to handle. At this size you can train a TensorFlow model on a single graphics processing unit (GPU) in one day! In comparison, the full 8M videos would have required a petabyte of video storage and 24 CPUs of computing power for a year. It's easy to see why pre-processing was required for video image analysis, and why frame segmenting was used to create a manageable data set.
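A minimal sketch of that pre-processing step is below: sample a local video at roughly one frame per second with OpenCV, then run each sampled frame through an ImageNet-trained Inception-V3 to get a frame-level feature vector. The file path is a placeholder, and this is a simplification of the idea, not the actual YouTube-8M pipeline code.

```python
import cv2
import numpy as np
import tensorflow as tf

# Inception-V3 trained on ImageNet, used as a frame-level feature extractor.
feature_model = tf.keras.applications.InceptionV3(
    weights="imagenet", include_top=False, pooling="avg"
)
preprocess = tf.keras.applications.inception_v3.preprocess_input

def frame_features(video_path: str):
    """Sample ~1 frame per second and return one 2048-d feature vector per sample."""
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS) or 30.0
    step = max(int(round(fps)), 1)          # keep roughly one frame per second
    features, index = [], 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if index % step == 0:
            rgb = cv2.cvtColor(cv2.resize(frame, (299, 299)), cv2.COLOR_BGR2RGB)
            batch = preprocess(np.expand_dims(rgb.astype("float32"), axis=0))
            features.append(feature_model.predict(batch, verbose=0)[0])
        index += 1
    cap.release()
    return np.array(features)

# Hypothetical local file; prints (num_sampled_frames, 2048).
print(frame_features("sample_video.mp4").shape)
```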

Big Deep Learning Opportunity

 

Chase McMichael gives talk on video hacking to ACM Aug 29th Photo: Sophia Viklund used with permission

Google has beautifully created two big parts of the video deep learning trifecta. First, they opened up a video-based labeling system (YouTube-8M). This gives everyone in the industry a leg up in analyzing video; without a labeling system like ImageNet, you would have to do the insane visual analysis on your own. Second, Google open sourced TensorFlow, their deep learning platform, creating a perfect storm for video deep learning to take off. This is why some call it an artificial intelligence renaissance. Third is access to a big data pipeline. For Google this is easy, as they have YouTube. Companies that create large amounts of video or have user-generated video will greatly benefit.

The deep learning code and hardware are becoming democratized, and it's all about the visual pipeline. Having access to a robust data pipeline is the differentiation. Companies that have the data pipeline will create a competitive advantage from this trifecta.

Big Start

Following Google's lead with TensorFlow, Facebook launched its own open AI platform, FAIR, followed by Baidu. What does this all mean? The visual information disruption is in full motion. We're in a unique time where machines can see and think. This is the next wave of computing. Video SEO powered by deep learning is on track to be what keywords are to HTML.

Visual search is driving opportunity and lowering technology costs, propelling innovation. Video discovery is no longer bound by what's in a video description (the meta layer). The use cases for deep learning range from medical image processing to self-flying drones, and that is just a start.

Deep learning will have a profound impact on our daily lives in ways we never imagined.

Both Instagram and Snapchat are using sticker overlays based on facial recognition, and Google Photos sorts your photos better than any other app out there. Now we're seeing purchases linked with object recognition at Houzz, leveraging product identification powered by deep learning. The future is bright for deep learning and content creation. Very soon we'll be seeing artificial intelligence producing and editing video.

How do you see video visual search benefiting you, and what exciting use cases can you imagine?

Feature image is a YouTube-8M web interface screenshot taken by Chase McMichael on September 30th.

Hacking Digital Video Via Deep Learning, A Video Machine Learning Solution


Chase McMichael spoke at the ACM Bay Area Chapter Event on September 29th.

Intro to the Video Deep Learning Talk

Deep learning, image recognition, and object recognition are core elements of intelligent video visual analysis. Understanding context within video, and classifying it, creates a strong use case for video deep learning. Digital video is exploding, however few are leveraging the wealth of data or know how to harness visual analysis. A true reinforced deep learning system, using collective human intelligence linked with neural networks, provides the foundation for a new level of video insights. We're just at the beginning of intelligent video and of using this knowledge to improve video performance.

Chase McMichael talk at ACM on Hacking Video Via Deep Learning Photo: Sophia Viklund