How a Google Engineer Used Her AI Smarts to Create the Ultimate Family Archive – PCMag UK


COVID-19 lockdowns perhaps gave a few of you some time to organize old photos that have been languishing on SD cards or in boxes, but how many of you built an AI-powered searchable archive of family videos from almost 500 hours of footage?

Dale Markowitz, an Applied AI Engineer and Developer Advocate at Google, did just that. The Texas-based Princeton grad took hours of disorganized miniDV tape footage housed on Google Drive and turned it into an archive "that let me search my family videos by memories, not timestamps," she wrote in a July blog post. It was the ultimate Father's Day gift.

We spoke with Markowitz recently to find out how machine learning helped her get it done, and why AI is only one part of the puzzle when it comes to solving complex problems.

Although this project used a raft of Google tools, which we'll get to, it was actually not for the day job, but the coolest Father's Day gift, right? [DM] At Google, I spend lots of time trying to think up new use cases for AI and build prototypes focused on the more business-y side. But I always wanted to work on more fun, zany stuff and, with quarantine, I finally had SO MUCH TIME. So, yes, this one was a gift for my dad, who, by the way, is also a huge programmer nerd who works in machine learning.

As your dad works in machine learning, he would totally get what it took to build it out. Let's go "under the hood" with the details. [DM] Sure. So I uploaded all of my dad's videos to a cloud storage bucket and then analyzed them with the Video Intelligence API, which returns JSON. Basically, the API does all the heavy lifting, including: detecting scene changes; extracting on-screen text and timestamps using computer vision; transcribing audio; tagging objects and scenes in images; and so on.
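
For readers who want to try something similar, here is a minimal sketch of the kind of annotation request she describes, assuming the google-cloud-videointelligence Python client; the bucket path, file name, and feature choices are placeholders rather than her exact code.

```python
# A minimal sketch of the Video Intelligence API call described above, assuming
# the google-cloud-videointelligence Python client. The bucket and file names
# are placeholders, not the actual project setup.
from google.cloud import videointelligence

client = videointelligence.VideoIntelligenceServiceClient()

features = [
    videointelligence.Feature.SHOT_CHANGE_DETECTION,  # scene/clip boundaries
    videointelligence.Feature.LABEL_DETECTION,        # objects and scenes
    videointelligence.Feature.SPEECH_TRANSCRIPTION,   # audio transcripts
    videointelligence.Feature.TEXT_DETECTION,         # on-screen text (the costly one)
]

video_context = videointelligence.VideoContext(
    speech_transcription_config=videointelligence.SpeechTranscriptionConfig(
        language_code="en-US"
    )
)

# Kick off a long-running operation against a video already sitting in a
# Cloud Storage bucket, then wait for the JSON-serializable annotations.
operation = client.annotate_video(
    request={
        "input_uri": "gs://my-family-videos/tape_01.mp4",
        "features": features,
        "video_context": video_context,
    }
)
result = operation.result(timeout=3600)
```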

Because you needed to apply intelligence to what was probably hours of untagged material, right? [DM] Exactly. When my dad recorded on miniDV, the clips weren't saved into separate files. They'd all be smashed into one long, three-hour recording, separated by little flashes of black and white. The API was able to pick out where those clip boundaries should have been.
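
A rough illustration of how those boundaries can be read back out of the API response (field names follow the public API; this is a sketch, not her implementation):

```python
# Sketch: reading the detected clip boundaries out of the response from the
# annotate_video call above. On recent client versions the time offsets come
# back as datetime.timedelta objects.
def list_clips(result):
    annotation = result.annotation_results[0]
    for i, shot in enumerate(annotation.shot_annotations):
        start = shot.start_time_offset.total_seconds()
        end = shot.end_time_offset.total_seconds()
        print(f"clip {i}: {start:.1f}s -> {end:.1f}s")
```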

Regarding audio transcription, that must have helped in tagging, categorizing, and identifying what was on all those miniDVs. [DM] Yes, and I found this to be the coolest part of the project, because it let me search for hyper-specific things like "Pokemon" or "Gameboy." Also, my dad was a big video narrator, so I could search his commentary for milestones.
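
As a sketch of how those transcripts become searchable, the speech annotations can be scanned for a keyword and its approximate timestamp; the function below is illustrative and assumes the same response object as above.

```python
# Sketch: scanning the speech transcriptions for a keyword like "pokemon" and
# reporting roughly when it was said, using the word-level timestamps the API
# returns. Illustrative only; the real project indexes these records for search.
def find_in_transcripts(result, query):
    query = query.lower()
    annotation = result.annotation_results[0]
    for transcription in annotation.speech_transcriptions:
        for alternative in transcription.alternatives:
            if query not in alternative.transcript.lower():
                continue
            for word_info in alternative.words:
                if query in word_info.word.lower():
                    t = word_info.start_time.total_seconds()
                    print(f'"{word_info.word}" spoken at ~{t:.0f}s')
```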

As an applied AI engineer, you're experienced in this field, but others using the API won't need to be up on machine learning, right? Essentially, it's not quite, but almost, out-of-the-box in terms of building out the metadata and intelligence? [DM] Confirmed. You don't need any ML expertise to build out this project. It's very developer-friendly. Having said that, there was one more AI part of this project, which was implementing search. I wanted to be able to search through all those transcripts, scene labels, and object labels, but I didn't want to have to match the words exactly.

Because you needed a proper semantic search layer for this project? [DM] Exactly. I wanted to allow for near-matches and misspellings and even matching synonyms, such as treating the word "trash" the same as "garbage." As you know, in "semantic search," you want an algorithm that understands the semantic meaning of what you're saying regardless of the specific words and spellings you use. For that, I used a great Search as a Service tool called Algolia. I uploaded all my records (as JSON) and Algolia provided me with a smart semantic search endpoint to query those records.
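
A minimal sketch of the Algolia side, assuming the algoliasearch Python client (the v3-style API); the app ID, API key, index name, and record shape are placeholders, not her actual schema.

```python
# Sketch of the Algolia side: push one JSON record per clip, then query it.
from algoliasearch.search_client import SearchClient

client = SearchClient.create("YOUR_APP_ID", "YOUR_ADMIN_API_KEY")
index = client.init_index("family_videos")

records = [
    {
        "video": "tape_01.mp4",
        "clip_start": 312.4,
        "transcript": "look at the new gameboy he got for his birthday",
        "labels": ["child", "living room", "toy"],
    },
    # ... one record per detected clip
]
index.save_objects(records, {"autoGenerateObjectIDIfNotExist": True})

# Typo tolerance handles near-matches and misspellings out of the box;
# synonym pairs like "trash"/"garbage" are configured on the index.
hits = index.search("gameboy")["hits"]
```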

Obviously, you've got a corporate account as a Googler to use all these tools. But what would the cost be for a non-Googler to do this? And are you sharing your GitHub code so people can replicate this? [DM] Yep, the code is all open source. Though I should add that a lot of these features are available through Google Photos, which works with videos too, apart from the ability to search transcripts. Cost-wise, I analyzed 126GB of video (about 36 hours) and my total cost was $300. I know that seems high, but it turns out the bulk of the cost came from one single type of analysis: detecting on-screen text. Everything else amounted to just $80. As on-screen text was the least interesting attribute I extracted, I recommend leaving it out unless you really need it. Also, the first 1,000 minutes of video fall within the Google Cloud free tier. Besides the ML parts, storing my data in Algolia runs me around $50 a month for around 90,000 JSON objects. But I haven't done much optimizing, and they do have a free tier.
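
For a rough sense of scale, here is a back-of-envelope split using only the figures quoted above (36 hours of footage, a $300 total, and roughly $80 once on-screen text detection is excluded); the per-minute numbers are inferred from those figures, not published pricing.

```python
# Back-of-envelope cost split based on the numbers from the interview.
total_minutes = 36 * 60            # ~2,160 minutes analyzed
text_detection = 300.0 - 80.0      # ~$220 attributed to on-screen text detection
everything_else = 80.0

print(f"on-screen text detection: ~${text_detection / total_minutes:.2f} per minute")
print(f"all other features combined: ~${everything_else / total_minutes:.2f} per minute")
```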

You're the overall host on YouTube for the new series "Making with Machine Learning." What's up next there in terms of projects? [DM] Machine-generated recipes, automatically dubbed videos, and an AI dash cam. Well, if I can get those things to work, that is. I never really know until I start building them. Another thing I've been fascinated with lately is ways to do machine learning with little or no data, and zero-shot learning. More on that coming soon.

We'll look out for those. Now let's do some background on you: What drew you to study computer science and why specifically at Princeton? [DM] I originally decided to go to Princeton because I wanted to be a theoretical physicist, and I really admired Professor Richard Feynman when I was in high school. But back in 2013, when I was a sophomore in college, it really felt like computer science was the place to be: everything was developing so quickly, from Arduino to AI to brain-computer interfaces. In retrospect, though I didn't know it then, majoring in computer science was a great decision, because there's almost no field, scientific or otherwise, that hasn't benefited from machine learning. In fact, sometimes it seems like some of the most cutting-edge work in biology and neuroscience and physics is coming from ML.

What's caught your eye recently in terms of ML? [DM] Specifically within the field of biophysics, I'd say DeepMind's new protein-folding model, AlphaFold 2, is a great example.

You worked as a researcher on brain-machine interfaces to measure sustained attention. Can you give us a brief explanation of what you were doing there? [DM] In that lab, some researchers had discovered they could (roughly) measure attention by having people do an extremely mundane task in an fMRI machine and then analyzing their brain scans. They were actually using deep learning, which was pretty revolutionary in neuroscience at the time. The problem is that fMRI machines are extremely expensive. I was investigating whether you could get similar results using an EEG machine (which is much cheaper), and specifically a portable, wireless EEG (which is much, much cheaper). The results were mixed, but I think, since then, portable EEG machines have gotten better at taking clear readings, and I have gotten better at machine learning.

You moved from data science to applied AI and your focus is on how people can apply AI, ML, etc. But do you also interface with the more theoretical AI people at Google, or only tangentially? [DM] There is a pretty tight relationship between Google Cloud and Google Research. The field changes so quickly that there has to be. When a splashy research paper comes out, it takes almost no time before customers start asking how to get it on Google Cloud. One good example is around explainability and responsible AI. Now that machine learning is becoming more accessible, more folks can build their own models. But how do you know those models are accurate? How do you know you can trust them, and that they won't make predictions that are embarrassing or offensive? The answer is closely linked to explainability, our ability to understand why models make the predictions they do; i.e., it's hard to trust black-box models.

Yeah, there's a big push for explainable AI right now. [DM] This is a tough problem, and an active area of research across Google. But we've been working very closely with Google Research to add explainability into our customer-facing products.

At Google I/O 2019, you focused on democratizing AI: allowing developers to use Google's AI tools, like AutoML, and off-the-shelf APIs to create cool stuff. Tell us more about that. [DM] ML has gotten way easier and more accessible for developers over the past five years. And one of the reasons that's so exciting is because more people from different backgrounds start using it and we end up with very creative projects. Sometimes people see a project I've built and they'll riff on it, which I think is super cool. For example, I built a tennis serve analyzer, and then some folks built a cricket and a badminton version. I saw a yoga pose detector, and someone built an AI Diary using some of the same tech as my video archive analyzer.

Thinking more broadly, it occurred to me that many of your AI-powered projects are applications that could help non-neurotypical people navigate the world. For example, you engineered an AI Stylist, which could illuminate social cues and help people be workplace- or situation-appropriate. [DM] Interesting. On one hand, there are definitely great applications of AI for non-neurotypical folks. The most compelling one I've heard of involves using computer vision to understand facial expressions and emotions. On the flip side, I try to avoid using machine learning in situations where the result of a mistake is catastrophic.

On that note, when I interviewed Dr. Janelle Shane, she had some bizarre brownie recipes generated by one of her AIs, because that stuff is harder than most people imagine. For example, AI doesn't have "common sense," so you had to build in rules that a human wouldn't need, i.e., "I need two shoes, a left one and a right one, but only one shirt or hat." Any wardrobe mishaps with the stylist before it got it right? [DM] Oh yes, 100%. Furthermore, I would say using a combination of ML and human rules is a pretty good design pattern. One mistake I see people make a lot is trying to solve a problem completely end-to-end with AI. It's better to use ML only for the parts of your system that really need it, such as recognizing a clothing item from an image, and then write simple rules in places where ML isn't necessary, such as combining clothing items to make an outfit. Human rules, i.e., "an outfit contains exactly two shoes," are usually easier to understand, debug, and maintain than ML models. One thing that seemed to trip up the stylist app was that I took a bunch of pictures of clothing on mannequins; my vision model was trained on pictures of people, not mannequins.
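
A toy sketch of that ML-plus-rules pattern: an ML model proposes labeled clothing items with confidences, and a few lines of hand-written rules decide what counts as an outfit. The label names, thresholds, and rules here are illustrative assumptions, not taken from her stylist code.

```python
# ML proposes (label, confidence) pairs; plain human rules assemble the outfit.
from collections import defaultdict

def build_outfit(detected_items, min_confidence=0.7):
    """detected_items: list of (label, confidence) pairs from a vision model."""
    counts = defaultdict(int)
    for label, confidence in detected_items:
        if confidence >= min_confidence:  # only trust confident detections
            counts[label] += 1

    # Human rules: exactly two shoes, exactly one shirt, at most one hat.
    rules = {"shoe": (2, 2), "shirt": (1, 1), "hat": (0, 1)}
    outfit = {}
    for item, (lo, hi) in rules.items():
        n = min(counts[item], hi)
        if n < lo:
            return None                   # not enough items for a valid outfit
        outfit[item] = n
    return outfit

# e.g. build_outfit([("shoe", 0.9), ("shoe", 0.8), ("shirt", 0.95), ("hat", 0.4)])
# -> {"shoe": 2, "shirt": 1, "hat": 0}
```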

The vision model was looking for humans, not static clothes horses? [DM] Yup. That really tricked the model. It was convinced the mannequin was a suitcase or something. By the way, I published the code on GitHub if others want to try it out.

At Google I/O, you also talked about custom sentiment analysis using natural language. Has that been deployed into something cool like a concurrent translator that can detect irony or emotion, i.e., good for non-native speakers on business trips abroad, if we ever get to do those again? [DM] Interesting idea. We're still struggling with irony detection in NLP. But can you really blame a computer for not recognizing irony when lots of humans can't, either?

Good point. [DM] I also suspect irony is largely contextual, i.e., text paired with an image, or spoken in a particular way, which makes the problem more challenging. Detecting emotion from speech is a cool idea. But I'd probably opt not to analyze just the words the person is saying (text sentiment) and focus more on their intonation. Sounds like a neat project. But like many ML problems, the challenge is finding a good training dataset.

True. So, wrapping it up, do you see the AI tools that you're working with now as a way of building a smart layer between IRL and our silicon cousins (embodied/non-bodied AIs)? For example, when I interviewed AI researcher Dr. Justin Li, we talked about AI being able to anticipate our needs before we know we have them. [DM] In the future, yes, I think humans and AIs will work closely together. But for me, what's most compelling are use cases where machine learning models are uniquely well-suited to do something that humans can't do or aren't good at. For example, people make really good assistants and companions and teachers, but they're not very good at processing millions of web pages in seconds or discovering exoplanets or predicting how proteins fold. So it's in these applications, I believe, that AI can make the most impact.

