Microsoft and Nvidia team up to train one of the world's largest language models – VentureBeat


Microsoft and Nvidia today announced that they trained what they claim is the largest and most capable AI-powered language model to date: Megatron-Turing Natural Language Generation (MT-NLG). The successor to the companies' Turing NLG 17B and Megatron-LM models, MT-NLG contains 530 billion parameters and, Microsoft and Nvidia say, achieves unmatched accuracy in a broad set of natural language tasks, including reading comprehension, commonsense reasoning, and natural language inference.

"The quality and results that we have obtained today are a big step forward in the journey towards unlocking the full promise of AI in natural language. The innovations of DeepSpeed and Megatron-LM will benefit existing and future AI model development and make large AI models cheaper and faster to train," Nvidia's senior director of product management and marketing for accelerated computing, Paresh Kharya, and group program manager for the Microsoft Turing team, Ali Alvi, wrote in a blog post. "We look forward to how MT-NLG will shape tomorrow's products and motivate the community to push the boundaries of natural language processing (NLP) even further. The journey is long and far from complete, but we are excited by what is possible and what lies ahead."

In machine learning, parameters are the part of the model that's learned from historical training data. Generally speaking, in the language domain, the correlation between the number of parameters and sophistication has held up remarkably well. Language models with large numbers of parameters, more data, and more training time have been shown to acquire a richer, more nuanced understanding of language, for example gaining the ability to summarize books and even complete programming code.

To train MT-NLG, Microsoft and Nvidia say that they created a training dataset with 270 billion tokens from English-language websites. Tokens, the units into which text is split for natural language processing, can be words, characters, or parts of words. Like all AI models, MT-NLG had to be trained by ingesting a set of examples to learn patterns among data points, like grammatical and syntactical rules.
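As a minimal sketch of those three granularities, the snippet below splits the same phrase at the word, character, and subword level. The subword split here is hand-picked purely for illustration; production models like MT-NLG rely on learned subword tokenizers (such as byte-pair encoding), which this toy example does not implement.

```python
# Toy illustration of token granularities; the subword split is hard-coded,
# whereas production tokenizers (e.g., BPE) learn theirs from data.
text = "unbelievable results"

word_tokens = text.split()                             # word-level
char_tokens = list(text.replace(" ", ""))              # character-level
subword_tokens = ["un", "believ", "able", "results"]   # hand-picked subwords

print(word_tokens)        # ['unbelievable', 'results']
print(len(char_tokens))   # 19 characters
print(subword_tokens)
```

Subword schemes are the middle ground most large models use: common words stay whole, while rare words decompose into reusable pieces, keeping the vocabulary manageable.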

The dataset largely came from The Pile, an 835GB collection of 22 smaller datasets created by the open source AI research effort EleutherAI. The Pile spans academic sources (e.g., arXiv, PubMed), communities (Stack Exchange, Wikipedia), code repositories (GitHub), and more, which Microsoft and Nvidia say they curated and combined with filtered snapshots of Common Crawl, a large collection of webpages including news stories and social media posts.

Training took place across 560 Nvidia DGX A100 servers, each containing 8 Nvidia A100 80GB GPUs.

When benchmarked, Microsoft says that MT-NLG can infer basic mathematical operations even when the symbols are badly obfuscated. While not extremely accurate, the model seems to go beyond memorization for arithmetic and manages to complete tasks containing questions that prompt it for an answer, a major challenge in NLP.

It's well-established that models like MT-NLG can amplify the biases in data on which they were trained, and indeed, Microsoft and Nvidia acknowledge that the model "picks up stereotypes and biases from the [training] data." That's likely because a portion of the dataset was sourced from communities with pervasive gender, race, physical, and religious prejudices, which curation can't completely address.

In a paper, the Middlebury Institute of International Studies' Center on Terrorism, Extremism, and Counterterrorism claims that GPT-3 and similar models can generate informational and influential text that might radicalize people into far-right extremist ideologies and behaviors. A group at Georgetown University has used GPT-3 to generate misinformation, including stories around a false narrative, articles altered to push a bogus perspective, and tweets riffing on particular points of disinformation. Other studies, like one published in April by researchers at Intel, MIT, and the Canadian AI initiative CIFAR, have found high levels of stereotypical bias in some of the most popular open source models, including Google's BERT and XLNet and Facebook's RoBERTa.

Microsoft and Nvidia claim that they're "committed to working on addressing [the] problem" and encourage "continued research to help in quantifying the bias of the model." They also say that any use of Megatron-Turing in production "must ensure that proper measures are put in place to mitigate and minimize potential harm to users," and follow tenets such as those outlined in Microsoft's Responsible AI Principles.

"We live in a time [when] AI advancements are far outpacing Moore's law. We continue to see more computation power being made available with newer generations of GPUs, interconnected at lightning speeds. At the same time, we continue to see hyper-scaling of AI models leading to better performance, with seemingly no end in sight," Kharya and Alvi continued. "Marrying these two trends together are software innovations that push the boundaries of optimization and efficiency."

Projects like MT-NLG, AI21 Labs' Jurassic-1, Huawei's PanGu-Alpha, Naver's HyperCLOVA, and the Beijing Academy of Artificial Intelligence's Wu Dao 2.0 are impressive from an academic standpoint, but building them doesn't come cheap. For example, the training dataset for OpenAI's GPT-3, one of the world's largest language models, was 45 terabytes in size, enough to fill 90 500GB hard drives.

AI training costs dropped 100-fold between 2017 and 2019, according to one source, but the totals still exceed the compute budgets of most startups. The inequity favors corporations with extraordinary access to resources at the expense of small-time entrepreneurs, cementing incumbent advantages.

For example, OpenAI's GPT-3 required an estimated 3.14 × 10^23 floating-point operations (FLOPs) of compute during training. In computer science, FLOPS (floating-point operations per second) is a measure of raw processing performance, typically used to compare different types of hardware. Assuming OpenAI reserved 28 teraflops (28 trillion floating-point operations per second) of compute across a bank of Nvidia V100 GPUs, a common GPU available through cloud services, a single training run would cost $4.6 million. One Nvidia RTX 8000 GPU with 15 teraflops of compute would be substantially cheaper, but it'd take 665 years to finish the training.
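Those figures can be sanity-checked with simple arithmetic. This is a rough sketch that assumes perfectly sustained throughput with zero overhead, which real training never achieves:

```python
# Back-of-the-envelope check of the GPT-3 training-time figures above.
total_flops = 3.14e23            # estimated total floating-point operations
v100_bank = 28e12                # 28 TFLOPS sustained across the V100 bank
rtx_8000 = 15e12                 # 15 TFLOPS on a single RTX 8000
seconds_per_year = 365 * 24 * 3600

years_v100 = total_flops / v100_bank / seconds_per_year
years_rtx = total_flops / rtx_8000 / seconds_per_year

print(f"{years_v100:.0f} GPU-years at 28 TFLOPS")   # ~356 GPU-years
print(f"{years_rtx:.0f} years on one RTX 8000")     # ~664, matching the cited 665
```

The $4.6 million estimate follows from multiplying those GPU-years by cloud V100 hourly rates at the time.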

Microsoft and Nvidia say that they observed between 113 and 126 teraflops per second per GPU while training MT-NLG. The cost is likely to have been in the millions of dollars.

A Synced report estimated that a fake news detection model developed by researchers at the University of Washington cost $25,000 to train, and Google spent around $6,912 to train a language model called BERT that it used to improve the quality of Google Search results. Storage costs also quickly mount when dealing with datasets at the terabyte or petabyte scale. To take an extreme example, one of the datasets accumulated by Tesla's self-driving team, 1.5 petabytes of video footage, would cost over $67,500 to store in Azure for three months, according to CrowdStorage.
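For a sense of scale, the storage rate implied by the Tesla example works out as follows. This is a sketch assuming flat pricing and a round 1 PB = 1,000,000 GB; actual Azure pricing varies by tier and region:

```python
# Implied per-GB-month storage rate from the figures above.
petabytes = 1.5
months = 3
total_cost = 67_500.0            # quoted three-month Azure cost

gigabytes = petabytes * 1_000_000
rate_per_gb_month = total_cost / (gigabytes * months)
print(f"${rate_per_gb_month:.3f} per GB-month")   # $0.015 per GB-month
```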

The effects of AI and machine learning model training on the environment have also been brought into relief. In June 2020, researchers at the University of Massachusetts at Amherst released a report estimating that the amount of power required for training and searching a certain model involves the emission of roughly 626,000 pounds of carbon dioxide, equivalent to nearly five times the lifetime emissions of the average U.S. car. OpenAI itself has conceded that models like Codex require significant amounts of compute, on the order of hundreds of petaflops per day, which contributes to carbon emissions.

In a sliver of good news, the cost of FLOPS and basic machine learning operations has been falling over the past few years. A 2020 OpenAI survey found that since 2012, the amount of compute needed to train a model to the same performance on classifying images in a popular benchmark (ImageNet) has been decreasing by a factor of two every 16 months. Other recent research suggests that large language models aren't always more complex than smaller models, depending on the techniques used to train them.
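That "factor of two every 16 months" trend can be written as a simple formula, assuming for illustration that the improvement is smoothly exponential between measurement points:

```python
# Compute needed to reach a fixed ImageNet accuracy, relative to a baseline,
# shrinks by 2x every 16 months under the trend described above.
def efficiency_gain(months: float) -> float:
    """Multiplicative reduction in required compute after `months`."""
    return 2 ** (months / 16)

# Over the 84 months from 2012 to 2019, the implied reduction is ~38x.
print(f"{efficiency_gain(84):.0f}x less compute needed")
```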

Maria Antoniak, a natural language processing researcher and data scientist at Cornell University, says that when it comes to natural language, it's an open question whether larger models are the right approach. While some of the best benchmark performance scores today come from large datasets and models, the payoff from dumping enormous amounts of data into models is uncertain.

"The current structure of the field is task-focused, where the community gathers together to try to solve specific problems on specific datasets," Antoniak told VentureBeat in a previous interview. "These tasks are usually very structured and can have their own weaknesses, so while they help our field move forward in some ways, they can also constrain us. Large models perform well on these tasks, but whether these tasks can ultimately lead us to any true language understanding is up for debate."


Best Predictive Analytics Software for 2021 – CIO Insight

Today's companies generate vast amounts of data, representing a valuable opportunity to become resilient, avoid mistakes, and more accurately address customers' needs. Predictive analytics software turns data into insights companies can act upon, but not every solution fits each need.

Here are some useful tips for choosing the most appropriate predictive analytics solution for your business, as well as recommendations for tools to consider.

Predictive analytics software analyzes data to look for patterns and predict likely future outcomes. It can improve efficiency, detect fraud, or give companies an edge over competitors.

Software solutions typically work far faster than human data scientists, and can find connections they might've missed.

Predictive analytics software can also help supply chain professionals deliver orders faster while avoiding sellouts, overstocks, and other unwanted events in an increasingly challenging market. A 2021 study of people in the supply chain sector found that 31% of respondents currently use predictive analytics software. Plus, 48% expect to utilize it within the next five years.

That makes predictive analytics tools ideal for helping companies learn more about their customers, determine the best time to enter a new market, and reach other critical business goals.

Read more: What Is Predictive Analytics?

SAP Analytics Cloud covers a vast range of data analytics tools in one cloud-based suite. For example, natural language processing working in the background lets you ask questions in a conversational format and get instant, data-driven answers.

Being on the cloud makes it easy to deploy and scale. Plus, advanced machine learning features streamline many otherwise time-consuming tasks and let you unlock previously hidden details. The Smart Discovery tool applies machine learning to selected data sets, showing relevant patterns and relationships without human bias. Also, you won't need the help of a data science professional to use this feature.

The Smart Insights capability brings context and clarification to your data. At the same time, intelligent algorithms running in the background suggest additional visualizations that make your information more actionable and easy to understand. Although SAP Analytics Cloud includes many advanced options, it remains easy to use and accessible.

Azure Machine Learning is a simple, well-laid-out predictive analytics tool that anyone can start using quickly. It supports coding and non-coding operations, giving businesses more flexibility.

The drag-and-drop machine learning algorithm builder streamlines the creation and publishing processes. You can use it to automate data labeling tasks, too. If you're worried about unintentionally introducing bias into the models, this tool's disparity metrics help you spot and remedy such problems early. Security is another selling point of this product: it enables setting up role-based access control over resources and their usage.

Since this tool works with numerous development frameworks and programming languages, it's a convenient option for minimizing your organization's predictive analytics learning curve. You'll also appreciate only paying for what you need, since this Azure product does not require upfront investments. The ability to set workspace and resource-level quotas helps you manage spending, too.

SAS Visual Statistics lets users see data at a granular level, unlocking new insights faster and gaining an edge over competitors. Set up the interface to allow multiple people to interact with the information, whether adding or changing variables or dealing with outliers. Users can also instantly learn how changes alter a model's predictive capabilities.

This tool also has self-service cleaning features with built-in artificial intelligence. You can access, clean, and transform data from a single intuitively designed interface, making it easy to stay productive and make the most of your company's information.

This cloud-based product is also excellent for companies that have data professionals working across multiple locations. People can seamlessly share visualizations, keeping output levels high regardless of geographical distance. Model scoring and comparison tools also help users choose which models to use as new information becomes available.

Sisense Fusion Analytics caters to people of all coding backgrounds. Whether you prefer a no-code, low-code, or code-first approach, this product will meet your needs. Data can be filtered into dashboards with no tech expertise required.

This product comes with built-in machine learning models for predictive analytics. Therefore, it's an ideal choice if you want to start using your data for better decision-making without the typical upfront time investment. An artificial intelligence feature allows you to simply type a question to begin examining the information in new, powerful ways.

This tool also handles large amounts of complex data, making it a wise choice if you want a highly scalable option that can immediately benefit your business. Sisense Fusion Analytics connects to numerous third-party apps, facilitating the easy importation of information regardless of its type or source. The open API framework also supports customizing the tool for different applications.

The Alteryx Intelligence Suite enables choosing one of three modes (automated, assisted, or expert) to create machine learning models that help you focus on your business's future. You can build data analytics and machine learning models in minutes, shortening the time before you can start capitalizing on the associated new insights.

The tool also has a text analytics feature that works on PDFs and other typed materials. Use it to run sentiment analysis on customer groups to better determine how they'll respond to upcoming campaigns. Run automated data health checks to cleanse the information before running it through your predictive models.

Drag-and-drop functionality within this product lets you quickly start using automated features while working with structured and unstructured data. Also, choose from more than 70 pre-engineered features to increase a model's predictive capabilities. It's also easy to improve your results with prepared data packages from reliable sources, such as Experian, TomTom, and the U.S. Census.

IBM SPSS is a platform featuring two main products. SPSS Statistics is for locating specific answers in data, while SPSS Modeler is for visualizing those answers. Both programs integrate with IBM's Watson product, which makes deployment simpler for many existing customers. The intuitive interfaces help people start getting the most out of these products without long learning periods.

The Modeler comes with more than 40 machine learning models ready for immediate use. There are also in-database performance capabilities that minimize the need to move the information before analyzing it.

SPSS Statistics enables better data management, letting users prepare the information and extract meaningful insights from it. Among its features, the tool allows making categorical predictions and creating custom tables.

H2O.ai uses open-source artificial intelligence, making it easy to create custom features on the platform. Numerous automation options accelerate processes such as validation and cross-validation. There are also various constraints and parameter controls. They help target and minimize bias, creating accurate, trustworthy results.

Multiple industry-specific templates help businesses start using predictive analytics sooner. Plus, the low-code application development framework enables people to reduce time spent on the important but time-consuming parts of building predictive algorithms and apps.

Anaconda Enterprise is primarily a tool for companies that have data scientists on their teams. It offers extensive controls, robust security features, and more than 1,500 packages in Python and R. It also features several sample templates for repeatable tasks. Those help users get started with their projects faster.

Anaconda is available on multiple platforms and can scale up easily. If you need additional help learning to use this product, schedule a Kickstart that allows you to meet with data professionals who know the features and their capabilities inside and out. The built-in security features and failover controls limit IT team hassles, too.

Making predictive analytics work as effectively as possible for your business requires understanding the available features and choosing what would be most valuable for your intended use cases. Here are some examples of frequently requested features.

Many company leaders consider real-time reporting a must-have feature. That's especially true if they're part of fast-moving or high-value industries like health care or manufacturing. In sectors like those, failing to spot an impending problem can have catastrophic consequences.


Additionally, a tool with built-in data preparation features could save time for company representatives and result in better accuracy. After all, even the most advanced predictive analytics software is only as reliable as the information it receives. If you have thousands of duplicate, incomplete, or inconsistently formatted records, theyll make your outcomes less trustworthy.

Selecting a tool with multiple visualization options is also a wise move for many businesses. Users may need to examine the data from several perspectives to grasp the associated insights. Also, certain ways of presenting the information may be easier to digest for board members or others without data science backgrounds.

Scalability is another characteristic you may want your predictive analytics tool to have. Consider your current needs and how they could change over the next three to five years. Is there a possibility a business expansion, merger, or another event could increase the data you need to analyze and the type of events you want to predict?

Every company has unique needs and goals for its predictive analytics operations. These options should provide something for every business, helping firms of all sizes, industries, and expertise capitalize on predictive analytics.

Before investing in one of these tools, take the time to understand what your company hopes to achieve with the solution. Doing that will set expectations and help you find the most appropriate and valuable product.

Read next: Top Big Data Tools & Software for 2021


Starting a health technology company in Baltimore? Here are 43 resources to know – Technical.ly

The digital health sector was growing rapidly. Then the pandemic came.

In this case, it was a story of growth.

Following distancing requirements and loosening regulations, telemedicine became the name of the game for many healthcare systems across the nation. It was an area where Baltimore startups could add value, whether it was the digital front door of b.well Connected Health or Tissue Analytics' work with healthcare institutions to remotely monitor wound care patients.

Now, healthcare leaders are learning from the shifts of 2020, and planning for how they'll keep the useful technology that was implemented during the pandemic in place. Going forward, hospitals, clinics and healthcare providers will keep seeking out ways to modernize systems and adopt new methods of care. So it's a good bet that entrepreneurs will be building tools to help them.

In Baltimore, entrepreneurs will find an existing network of resources for healthcare-focused startups that is growing around the city's university-affiliated healthcare institutions. It has helped to nurture new software and data-powered tools under the umbrella of digital health. There are also resources focused on the hardware that goes into medical devices, and the diagnostics and therapeutics that help to detect and prevent disease.

Drawing on nearly a decade of Technical.ly's reporting in the city, we've compiled a look at resources for startups looking to break in at the intersection of health and tech:

These monthly health tech-centered Zoom sessions are a great way to meet those further along in the entrepreneur journey, whether it's featured panelists or other attendees.

The next event is on Oct. 20 at 5 p.m., and will be led by Kelliann Wachrathit. She will share her unique perspectives from integrating her background in bioengineering, regulatory research at the U.S. Food and Drug Administration, regulatory consulting, assessing intellectual property and nursing.

An initiative of the Johns Hopkins Technology Innovation Center, this network is raising a flag for digital health throughout the region. It's connecting key players, building a knowledge base, directly supporting young ventures and investing in the talent pipeline with CEO roundtable events, innovation labs and workshops. It is funded by a three-year, $1.3 million grant from the U.S. Department of Commerce's Economic Development Administration.

Coming up in November, Chesapeake DHX is organizing a five-day innovation lab on cancer data with cancer researchers, biologists and imaging scientists.

Alongside joining the Maryland Business Innovation Association itself, entrepreneurs were given the opportunity in 2021 to participate in the challenge and work with companies from the local corporate community, like CareFirst's innovation arm Healthworx, on problem-solving. It offered a chance to create new professional relationships while testing product solutions on the market.

Organized jointly by Johns Hopkins Technology Ventures and the University of Maryland, Baltimore, this event series gathers local entrepreneurs and investors regularly to learn about all of the resources the city can provide, and offer networking. The next event is set for October 21, and centers around building equitech in Baltimore with accelerator network Techstars and local ecosystem builder Upsurge Baltimore as the featured organizations.

Technical.ly (hi!) has an online community providing space for technologists and entrepreneurs across the cities we cover, including Baltimore, DC, Philly, Delaware and Pittsburgh. In the Slack, you'll not only get to connect with fellow entrepreneurs but also get an opportunity to contribute to reporting by engaging with the Technical.ly community.

This recently launched incubator from LifeBridge Health and CareFirst's Healthworx is seeking companies that are beyond the idea stage, with a full-time team, a product or model that solves a key healthcare challenge, and traction like an MVP, early customers or financing. The program offers investment of up to $100,000 for each selected company, with each partner contributing up to $50,000. It's currently in the midst of its first cohort.

This Loyola University Maryland program accepts applications from companies across Baltimore, which can include healthcare-focused startups. It offers a chance to connect with mentors, partake in programming to build a business, and grow a network in Baltimore.

ETC (Emerging Technology Centers) opened the citys first seed accelerator program in 2012. With support from the Abell Foundation, it has supported a variety of companies since, including plenty building healthcare technology.

Applications are currently open for this accelerator through December 5. The program is looking for five startups that want to solve some of the city's most pressing challenges. Businesses that are chosen for the 13-week program receive $50,000 in seed funding.

With a focus on medical device manufacturing, the medtech venture center inside Port Covington's City Garage doubles as a coworking space and incubator to guide companies from start to launch. Led by longtime Maryland healthcare ecosystem leader Bob Storey, it has emerged as a base for companies that are ready to exit the incubator and move into production.

Drawing on a model built at the National Science Foundation, I-Corps is designed to help scientists and engineers as they move from a discovery to a company. The multi-week program offers a chance to identify valuable product opportunities that can emerge from academic research, and gain skills in entrepreneurship through training in customer discovery and guidance from established entrepreneurs. It's offered at no cost, and provides a $2,880 grant to participants that complete the program. Upcoming sessions are expected to start on October 28, in January 2022 and in April 2022.

Located downtown on West Pratt Street, this startup studio supports medical device inventors who want to move their ideas beyond the lab and into the marketplace. Entrepreneurs that work with the studio can expect a lot of hands-on development from the organization as they take an idea to minimum viable product with the studio's internal team of surgeons, neurologists and engineers. Think of it as a place that helps build startups by loaning a team of employees who are experts in their field.

Billing itself as Maryland's only hospital-based bioincubator, the Sinai Hospital facility offers lab space for growing startups, with the added benefit of being connected to the resources and network of LifeBridge Health.

Located at the university's hospital campus in East Baltimore, this space has both offices and wet labs for early-stage companies. Opened in 2017, it has become a nexus of activity for emerging local companies in life sciences and digital health.

Located across from the University of Maryland, Baltimore campus west of downtown, the 14-acre biomedical research park is home to a cluster of companies working on new products in various areas of healthcare. It has served as a home both for companies that grow out of work in the city and for others looking to open an office in Baltimore. In 2017, UMB added to its offerings by opening its innovation hub, called The Grid, to serve as a connecting point for entrepreneurs and the community on campus.

The Pigtown-based venture studio works with researchers to start new companies that commercialize discoveries inside the states institutions. Among many companies in its portfolio are a pair of ventures focused on computer-aided drug design. The company opened production space in Pigtown last year.

The Johns Hopkins Center for Bioengineering Innovation and Design offers master's-level training and a chance to explore where healthcare needs solutions. Students work side-by-side with clinicians to understand problems and develop ideas to solve them. It has resulted in the launch of many new startups that called Baltimore home, such as the aforementioned Tissue Analytics.

The Rockville-based organization has long had a statewide presence growing the work that turns discoveries made in the lab into the key technologies that power commercial companies. It connects the community through events, as well as assisting inventors with promising IP as they connect with investors and other key resources.

This searchable AI-powered site aims to connect the state ecosystem with info on the many accelerators, incubators, funds and mentors that startup founders can turn to for support. In 2021, TEDCO, the Maryland Department of Commerce and the University System of Maryland teamed up to launch this resource, which was built by Baltimore-based EcoMap Technologies.

Baltimore is part of the DMV-wide chapter for this group bringing women in the BioHealth Capital Region together. It has regular events, a mentoring group and more.

The life sciences-focused arm of the Maryland Tech Council, this trade group has a host of activities connecting companies among its membership, and works on advocacy efforts to advance the industry. It also stages 20 events during the year, like the recently-completed Bio Innovation Conference.

Dubbed "America's seed fund," this federal program provides non-dilutive research and development grants to companies working to commercialize technologies. The program has been a prime source of early funding at many Maryland startups working in healthcare. Maryland's TEDCO and Rockville-based OST Global Solutions host an SBIR Proposal Lab to help companies prepare an application.

The state-backed agency is an active funder of early-stage healthcare ventures through a variety of its funds. The Maryland Stem Cell Research Fund provides grants and loans for research work. The Maryland Innovation Initiative funds efforts to commercialize technology coming from the states universities. For companies ready for venture capital, TEDCO also operates pre-seed, seed and venture funds.

Early-stage businesses seek venture funding to grow. Drawing from a Technical.ly article that listed 75+ venture capital firms in the DMV area, here are local firms that seek to fund healthcare-focused tech companies:

An open source server for Fast Healthcare Interoperability Resources, also known as FHIR (pronounced "fire"), the standard servers use to give patients access to their records. The Annapolis Junction-based company's technology won the Office of the National Coordinator for Health Information Technology's Secure FHIR Server Challenge because of its focus on security measures and a design that accommodates multiple types of systems.

PHP is an open source scripting language for web development that offers easy data integration. If you're worried about viability, major companies like Facebook, Slack and Lyft use it in their tech stacks. But any coding language can be just as good; it depends on the platform your company is targeting for its client. If iOS users are your goal, maybe Swift is the way to go.

This cross-platform video editing platform comes in handy. In the digital age, you need to be able to engage on multiple platforms, and video is the perfect way to garner interest. The main benefit of this software is that it's free. If you're early-stage with little funding but trying to add video elements to a pitch deck or product showcase, this might be the software for you.

Who'd we miss? Let us know at baltimore@technical.ly.


Developer jobs: When it comes to building diverse teams, employers are still missing the mark – ZDNet

Companies must use the tools at their disposal to build diverse teams.

The tech industry has been facing increasing scrutiny over diversity, equity and inclusion issues in recent years. Even though having a policy aimed at hiring a diverse team can bring employers a greater range of talent to improve just about every business outcome, progress in this area remains disappointingly slow.

According to a report by hiring platform Hiretual, organizations are doing little to boost diversity within their software teams. Analysis of candidate searches made via its platform between January 1st and July 31st this year found that just 15% of searches applied a diversity, equity and inclusion (DE&I) filter to find developers from Black, Hispanic, Asian or Native American backgrounds.

In the US tech hub of San Francisco, just 9% of searches were for Black software engineers, 7% for Hispanic or Latinx candidates, 6% for Native American, and 3% for candidates from Asian backgrounds, the company's data found.

See also:IT strategy: How an investment in diversity can boost your business.

Hiretual's report included searches from the US cities where search volume for software engineers was highest: San Francisco, CA; Seattle, WA; New York, NY; Boston, MA; Washington, DC; Atlanta, GA; Los Angeles, CA; Chicago, IL; and Denver, CO.

The highest volume of searches for Black, Hispanic/Latinx, Native American and Asian candidates came from Washington DC: 17%, 14%, 13% and 10%, respectively.

Women were equally underrepresented in candidate searches, the data found: again, just 15% of searches on the platform were for female candidates specifically. Washington once again proved the most progressive here, with women accounting for 26% of the total search volume.

"The technology is there to help find and hire diverse candidates," said Steven Jiang, CEO at Hiretual. "Companies have the option to make a difference -- there are no more excuses."

According to the US Labor Department, job openings hit 10.1 million by the end of June 2021, up from 9.5 million in May.

Developers and software engineers have been amongst the most coveted tech workers throughout 2020 and 2021, owing to a boom in the demand for digital services and application development caused by the pandemic. Hiretual found some companies are now paying their talent up to 13.2% more than the national average, with 55% of US companies paying software engineers an average salary of $90,000 to $150,000 per year.

According to Hiretual's data, there are currently 1.2 million professionals with a software engineer title, and some 170,000 job postings for software engineers. It also found that 12% of professionals have changed jobs within the past 12 months, while 7% of professionals have changed jobs within the past six months.

The fact there are twice as many software engineer job postings compared to software engineers moving jobs supports the claims that tech companies are having a difficult time filling open roles, and that the demand for talent exceeds the supply, said Hiretual.

See also: Digital transformation: Two CIOs explain how to make it work

Opening up job vacancies to a more diverse talent pool, therefore, seems like an obvious step in addressing existing skills gaps, as well as boosting an organization's overall success.

"Job candidates have a lot of leverage and options to consider right now. So, companies looking for the best talent need to be trying to make the strongest impact from the very first impression, which requires an understanding for what candidates find value in today: things like building more diverse and inclusive teams, offering competitive pay and more flexible location options," Jiang told ZDNet.

"The data shows that companies willing to do that stand to fill more jobs, but many are missing the mark."

Hiretual's report also analysed the software skills most frequently searched for by recruiters. Python programming was the most in-demand proficiency, it found, followed by Java, C++, C# and object-oriented programming (OOP). Rounding out the top 10 skills were JavaScript, distributed systems, Microservices, C/C++, Amazon Web Services (AWS) and GoLang.


Android 12 has been released to the Android Open Source Project – Engadget

Following a preview at I/O 2021 and multiple betas since then, the next version of Google's mobile operating system is ready for prime time. Android 12 is now officially available. But if you own an Android device, don't get excited just yet. With today's announcement, Google is uploading the source code to the operating system to the Android Open Source Project (AOSP). As things stand, the update isn't publicly available on any current devices. But that should change soon.

Google says it will start rolling out Android 12 to Pixel devices starting sometime in the "next few weeks," with availability on Samsung, OnePlus, Oppo, Realme, Tecno, Vivo and Xiaomi devices to follow later this year. Once the OS finally makes it to your device, you can look forward to checking out Google's new Material You design language, an updated privacy dashboard that includes a timeline of all the data the apps on your phone have accessed, the ability to capture scrolling screenshots, a new one-handed mode and more. Until then, the wait continues.

All products recommended by Engadget are selected by our editorial team, independent of our parent company. Some of our stories include affiliate links. If you buy something through one of these links, we may earn an affiliate commission.


IBM and YMCA of Metropolitan Los Angeles Collaborate to Empower and Inform Voters with New Tech Solution for Underrepresented Communities – Yahoo…

Open source Call for Code for Racial Justice solution "Five Fifths Voter" deploys during National Voter Education Week to help educate voters and make their voices heard

ARMONK, N.Y. and LOS ANGELES, Oct. 5, 2021 /PRNewswire/ -- IBM (NYSE: IBM) and the YMCA of Metropolitan Los Angeles announced today during National Voter Education Week the deployment of a new solution designed to help improve awareness of local, statewide, and national issues in communities lacking traditional access to and education about voting. Five Fifths Voter is a web-based application built using open source-powered technology to help educate, empower and enable disenfranchised minority voters to overcome setbacks incurred by voter suppression.

Nationally, 66.8% of citizens 18 years and older cast ballots in the 2020 U.S. general election, with 62.6% of Blacks, 59.7% of Asians, and 53.7% of Hispanics voting, according to the United States Census Bureau [1]. Further, the Bureau notes that "Voter turnout also increased as age, educational attainment and income increased." In California, voting percentages and the breakdown of those numbers mirror the national trend. In non-presidential elections, voting percentages are significantly lower in all categories, as with the recent California Statewide Special (Recall) election [2]. Although the latest numbers are all-time highs, there is still much more to be done, and IBM and YMCA-LA are working together to continue to improve voter participation by creating a more streamlined process for underrepresented voters.

Five Fifths Voter, available for both desktop and mobile browsers, provides one place where users can check their registration status, register to vote, and access information about deadlines, ballot drop-offs, and polling locations. It also offers resources tailored for specific circumstances including parents requiring childcare, people with disabilities, convicted felons, and senior citizens.


"For decades, the YMCA has been an invaluable resource for young Angelenos and their families," said Los Angeles Mayor Eric Garcetti. "Through this collaboration with IBM, the Y will foster civic engagement and help to make our democracy more accessible to young people across the region."

The web app was customized to the diverse needs of Metropolitan Los Angeles' communities based on input from YMCA-LA and members of its Teens and Government program, who participated in design thinking workshops with IBM and identified challenges they have encountered in voting. The customization includes translation of the app into 11 languages prevalent in the area, resources for young voters to prepare themselves to vote in upcoming elections, and clear steps people can follow to help their community engage in the civic process, given the YMCA's focus on youth and underrepresented demographics.

"The collaboration with IBM gives the YMCA-LA the virtual tools to educate and empower our communities with essential resources to participate in the voting process," said Mario Valenzuela, Vice President of Equity and Inclusion, YMCA-LA. "We're thankful for this opportunity to provide realistic solutions to this youth-led initiative to get out the vote and encourage community members to use their voice to facilitate positive change."

Five Fifths Voter was one of seven projects IBM made available for anyone to contribute code to last year through open source as part of Call for Code for Racial Justice, an effort to bring the developer community together to create practical tools that help tackle one of the greatest challenges of our time: racial injustice. The code is containerized and can be deployed across hybrid cloud environments, including multi-cloud with Red Hat OpenShift. Some of the open source frameworks and languages used include Vue.js, Node.js, Python, Apache CouchDB, and Carbon Design System. The web app is hosted on IBM Cloud; was built using technology including IBM Watson Tone Analyzer and Watson Natural Language Understanding; and uses the Google Civic and Google Maps APIs.

The Call for Code for Racial Justice projects, including Five Fifths Voter, are shared online so anyone can contribute code to enhance them with a focus on three key areas: police and judicial reform and accountability, diverse representation, and policy and legislation reform. Call for Code for Racial Justice is part of the broader Call for Code tech-for-good initiative in which more than 500,000 developers and problem solvers across 180 nations have participated since launch in 2018 to address problems such as climate change and COVID-19 with open source-powered software.

"Working with strong community organizations like YMCA-LA was our goal when we first created Call for Code for Racial Justice," IBM Call for Code Director Ruth Davis said. "We believe bringing together developers, ecosystem partners, and communities around the world can drive lasting impact in the fight against systemic racism and are looking forward to enhancing Five Fifths Voter's capabilities and bringing it to more communities."

IBM and YMCA-LA invite developers in the Greater Los Angeles area and around the United States to contribute their ideas and code to the solution to further enhance it and make it even more relevant to their local communities' needs.

About IBM
IBM is a global leader in hybrid cloud and AI, serving clients in more than 170 countries. More than 2,800 clients use our hybrid cloud platform to accelerate their digital transformation journeys and, in total, more than 30,000 of them have turned to IBM to unlock value from their data. With this foundation, we continue to leverage Red Hat OpenShift as the leading platform to address our clients' business needs: A hybrid cloud platform that is open, flexible and secure. Guided by principles of trust, transparency and support for a more inclusive society, IBM also is committed to being a responsible steward of technology and a force for good in the world. For more information, visit: http://www.ibm.com.

About YMCA of Metropolitan Los Angeles
The YMCA-LA is committed to rebuilding communities by providing equitable programs and services to empower all Angelenos. The YMCA-LA is focused on fighting food insecurity, providing equity in education, making sure every child has the opportunity to experience the joy of sports, and ensuring kids and teens have a safe place to grow, learn and live a healthy lifestyle. Its health and wellness initiatives offer medical and mental health resources to ensure everyone has access to basic health needs. During the pandemic, the YMCA-LA became a safety net for millions of Angelenos. It provided millions of meals and hundreds of hours of free child care, arranged critical blood drives, and provided showers for the homeless, flu and COVID vaccines, and medical and mental health assistance. Visit ymcala.org for more information.

IBM Media Contact: Mike Sefanov, mike.sefanov@ibm.com, 650-281-8099

YMCA-LA Media Contact: Lisa Vega, lisa@lisavegagroup.com, 213-247-3075

_________________________

[1] https://www2.census.gov/programs-surveys/cps/tables/p20/585/table04b.xlsx
[2] Real-time data from the California Secretary of State


SOURCE IBM


Lidar developer Ouster agrees to buy Sense Photonics as it takes aim at the auto industry – Yahoo Tech

Ouster, a lidar company that went public this year via a SPAC merger, said it would acquire solid-state lidar startup Sense Photonics in an all-stock deal that was valued at around $68 million at close of markets on Monday.

Once the acquisition is complete, Ouster said it would establish a new business arm, Ouster Automotive, which will be headed by Sense CEO Shauna McIntyre. That business will integrate Sense's 200-meter-range solid-state lidar into a new lidar suite. San Francisco-based Sense's claim to fame is also its improved field of view, as TechCrunch's Devin Coldewey explained.

According to a news statement, Ouster Automotive will also aim to advance negotiations with five automotive OEMs, though additional details about these potential deals were not provided. Should they turn into something solid, production would begin in 2025 or 2026.

Lidar is a key sensor in most autonomous driving stacks. The sensor, whose name is a shortened form of light detection and ranging, measures distance using lasers to generate a 3D map of the world. Along with radar, cameras and software, lidar is a critical part of the AV systems of some of the leading developers today, including companies like Waymo and Argo AI.
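The geometry behind that 3D map is straightforward: each laser return pairs a measured range with the beam's direction, which converts to a Cartesian point. A minimal sketch, assuming a simple spherical-coordinate beam model (azimuth in the horizontal plane, elevation above it):

```python
import math

def lidar_point(range_m, azimuth_deg, elevation_deg):
    """Convert one lidar return (range plus beam angles) into an (x, y, z)
    point. A simplified model for illustration; real sensors also correct
    for beam origin offsets, motion, and timing."""
    az = math.radians(azimuth_deg)
    el = math.radians(elevation_deg)
    x = range_m * math.cos(el) * math.cos(az)
    y = range_m * math.cos(el) * math.sin(az)
    z = range_m * math.sin(el)
    return (x, y, z)

# A return 100 m straight ahead at horizon level maps to roughly (100, 0, 0).
x, y, z = lidar_point(100.0, 0.0, 0.0)
print(round(x, 6), round(y, 6), round(z, 6))
```

Firing many such beams per rotation, and repeating the conversion for each return, is what builds up the point cloud an autonomy stack consumes.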

In February, Ouster CEO Angus Pacala said on the podcast Shift that the future of the lidar industry would be marked by consolidation. "There's going to be three to five lidar companies within the next five years," he said. This new acquisition is a sign that Ouster intends to be at the forefront of turning this prediction into a reality.

Earlier this year, Ouster completed a merger with a blank-check firm in a deal valued at $1.9 billion. It joined rival lidar companies Luminar, Innoviz and Velodyne in taking the SPAC route to the public market. Ouster's stock hit a year-to-date trading high of $15.39 in February; today, it's trading at $7.41.

Update: A spokesperson for Ouster confirmed to TechCrunch that the company expects the majority of Sense's 80 employees to join McIntyre in moving to Ouster.

The spokesperson added, "Ouster's perspective has always been that auto OEMs want a multi-sensor suite of solid-state lidar, long to short-range, that can be manufactured at scale and integrated into the body of the vehicle for the low hundreds of dollars. That's exactly what we're planning to offer."


Pegasus: The Humanitarian Costs of Insecure Code – Security Boulevard

Pegasus: The Humanitarian Costs of Insecure Code

A look at the nature and effects of legal, advanced spyware on application security

Typically, stories about cyberattacks grab the reader's attention by describing the damage inflicted on a company in large dollar amounts. While multimillion-dollar ransomware demands are shocking, they can be quickly forgotten. After all, these situations are eventually worked out, and it's not as if anyone's life is in danger.

Pegasus attacks are different.

Pegasus attacks on iPhone and Android devices do not cost businesses millions in revenue. They do not trigger multiple expensive lawsuits for privacy violations or result in sensitive data being used for blackmail. Pegasus measures its damage by its chilling effect on privacy, the incalculable costs of information suppression, and in some cases, human lives.

Pegasus is an advanced spyware that exploits vulnerable mobile apps to gain a foothold on iPhone and Android devices. Once installed, Pegasus gives attackers a considerable amount of control over the device, including the ability to:

Pegasus is the creation of the NSO Group, an Israeli firm that licenses it to governments to perform surveillance. NSO states its technology is intended to prevent and investigate terrorism and crime to save thousands of lives around the globe. However, Pegasus is a highly sophisticated tool, and like any tool, its use is only as benevolent as the hand that wields it. The spyware allows governments to crack citizens' mobile devices, track them, and observe their communications. Whether it is solely used to target criminals is up to their discretion.

On the iPhone, Pegasus uses a zero-click attack against the iOS iMessage app to infect the device. A zero-click attack is one that requires no cooperation or interaction from the victim to succeed. Typically, these attacks directly exploit known app vulnerabilities and use data verification loopholes to avoid automated detection and other security features. Zero-click attacks also take lengthy steps to remove or obfuscate all traces of their existence, making them extremely difficult for threat researchers to detect.

Pegasus is easier to deploy on Android and can move laterally to exploit secondary attack vectors if the primary method of infection fails. The Android version of Pegasus does not rely on a zero-click attack but uses Framaroot to discover code exploits and root the device. Android, by design, does not keep the logs researchers use to identify a Pegasus infection. In fact, researchers must often use special tools to detect the presence of Pegasus on Android.

Both the Android and iPhone versions of Pegasus ultimately rely on exploiting vulnerable code. Yet, the spyware is so sophisticated that detecting its presence does little to reveal how it infiltrates a device. This is evident from the sheer length of time that iPhone users have struggled with Pegasus. Media outlets first reported the existence of the spyware in 2016. Apple released a quick fix for iMessage shortly afterward. Yet, the most recent iOS fix for Pegasus arrived on September 13, 2021, five years later.

On July 18th, Amnesty International and Forbidden Stories (a Paris-based nonprofit) named 50,000 individuals as potential targets of Pegasus attacks. Among the names were journalists, activists, politicians and other people of interest. The list was initially leaked to Forbidden Stories, which shared it with the media. The Amnesty International Security Lab collected a small sample of phones from members of the list and tested them for Pegasus infections. The lab discovered Pegasus indicators on 37 of 67 phones.

In response, NSO Group released a statement denying any wrongdoing and criticizing the methodology used by the lab. They reiterated their commitment to only serving law enforcement and intelligence agencies of vetted governments. NSO stated they do not operate Pegasus for clients or have access to internal client data. Therefore, they could not possibly possess or leak a list of targets.

Governments named by Amnesty International for violating their citizens privacy likewise denied any wrongdoing. In India, several journalists, opposition leaders, and three state officials were identified as appearing on the list. Forensic tests on 22 of the smartphones belonging to suspected Indian targets revealed that 10 were attacked by Pegasus. The Indian Government responded by denying they use Pegasus to target non-criminals.

One aspect that sets Pegasus apart from other malware is its focus on individual targets. While ransomware and APT groups may conduct surveillance on their targets before launching an attack, they are seldom concerned with individuals. Malware campaigns may involve spear-phishing or whaling attacks against high-ranked individuals, but the goal is usually obtaining their account credentials or access. Pegasus is deployed to directly monitor the individual, not steal their account privileges.

Likewise, traditional malware attacks usually focus on stealing money, hijacking data, or disrupting the operations of an organization. They almost always inflict financial damage through blackmail, extortion, regulatory fines, information theft, or harming the brand name. The damage Pegasus inflicts is personal and applies directly to the individual. This means developers accustomed to weighing the financial risks of vulnerable code should weigh humanitarian risks as well.

Pegasus also highlights the wide spectrum of adversaries devs are facing. The tactics, techniques, and procedures (TTPs) of APTs and black-hat hackers are well known and generally understood. Their attacks are unlawful, meaning compromised organizations can generally rely on the support of law enforcement. NSO, by contrast, is a well-funded private company, and its customers are governments and law enforcement agencies. This makes it unlikely that anyone officially deploying Pegasus will be considered a criminal. When cracking security on an individual's mobile device is not a crime, the app developer becomes the sole line of defense against Pegasus-like attacks.

Pegasus, like 84% of all cyberattacks, relies on exploiting vulnerabilities in the application layer to succeed. This makes application security testing through methods like SAST, DAST, IAST, and SCA key to preventing these attacks. Simply put, depriving organizations like NSO of vulnerabilities to exploit is the best way to stop them. Once vulnerable code is released, it can be extremely difficult to discover how it is exploited. If Apple, the world's largest company, is still patching iMessage five years after the first Pegasus infection, what chance do smaller businesses have?

Open-source code presents another problem. Many open-source libraries contain known vulnerabilities, yet 96% of proprietary applications contain open-source code. Simple steps like checking open-source code dependencies with tools like Intelligent SCA (I-SCA) can greatly improve application security by alerting development teams to these vulnerabilities. Likewise, static code analysis like next-generation SAST (NG-SAST) can provide developers with daily or weekly insight into vulnerabilities in custom and open source code. With these kinds of tools, it is possible to integrate security processes throughout the software development lifecycle to better protect user data in an application.
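The core of such a dependency check can be sketched as matching pinned versions against an advisory list. The advisory data below is invented for illustration; real SCA tools query curated vulnerability feeds such as OSV or the NVD:

```python
# A toy software-composition-analysis (SCA) pass: parse pinned dependencies
# and flag any that appear in a known-vulnerability advisory list.
requirements = """\
requests==2.19.0
flask==2.0.1
"""

# Invented advisory data: package name -> set of known-vulnerable versions.
advisories = {
    "requests": {"2.19.0", "2.19.1"},
}

def find_vulnerable(reqs, advisories):
    """Return (name, version) pairs from a requirements listing that match
    a vulnerable version in the advisory data."""
    hits = []
    for line in reqs.splitlines():
        line = line.strip()
        if not line or "==" not in line:
            continue
        name, version = line.split("==", 1)
        if version in advisories.get(name, set()):
            hits.append((name, version))
    return hits

print(find_vulnerable(requirements, advisories))  # [('requests', '2.19.0')]
```

Production tools add version-range matching, transitive dependency resolution, and severity scoring, but the flag-what-you-depend-on principle is the same.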

For more information on efficient ways to add security testing to the SDLC, visit Shiftleft.io.

Pegasus: The Humanitarian Costs of Insecure Code was originally published in ShiftLeft Blog on Medium, where people are continuing the conversation by highlighting and responding to this story.

*** This is a Security Bloggers Network syndicated blog from ShiftLeft Blog - Medium authored by The ShiftLeft Team. Read the original post at: https://blog.shiftleft.io/pegasus-the-humanitarian-costs-of-insecure-code-6f5afe6f36a1?source=rss----86a4f941c7da---4


Software security groups increased use of open source tech by 61% over 2 years – VentureBeat

The Transform Technology Summits start October 13th with Low-Code/No Code: Enabling Enterprise Agility. Register now!

BSIMM12 data indicates a 61% increase in software security groups' identification and management of open source over the past two years, almost certainly due to the prevalence of open source components in modern software and the rise of attacks using popular open source projects as vectors.

The growth in activities related to cloud platforms and container technologies shows the dramatic impact these technologies have had on how organizations use and secure software. For example, the Building Security In Maturity Model (better known as BSIMM) study made only five observations of the "use orchestration for containers and virtualized environments" activity in BSIMM10, while it made 33 observations two years later for BSIMM12, an increase of 560%.

Another emerging trend observed in the BSIMM12 research is that businesses are learning how to translate risk into numbers. Organizations are exerting more effort to collect and publish their software security initiative data, demonstrated by a 30% increase in the "publish data about software security internally" activity over the past 24 months.

BSIMM12 data also shows an increase in capabilities focused on inventorying software; creating a software bill of materials (BOM); understanding how the software was built, configured, and deployed; and organizations' ability to redeploy based on security telemetry.

Demonstrating that many organizations have taken to heart the need for a comprehensive, up-to-date software BOM, the BSIMM activity related to those capabilities, "enhance application inventory with operations bill of materials," increased from 3 to 14 observations over the past two years, a 367% increase.

The move from maintaining traditional operational inventories toward automated asset discovery and creating bills of materials includes adding "shift everywhere" activities such as using containers to enforce security controls, orchestration, and scanning infrastructure as code.
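As a rough illustration of the inventory side, a minimal software bill of materials for a Python environment can be built from the packages installed in it (real BOM tooling also records build, configuration, and deployment metadata, not just names and versions):

```python
import importlib.metadata

def software_bom():
    """Return a sorted list of (name, version) pairs for every installed
    Python distribution -- a bare-bones software bill of materials."""
    return sorted(
        (dist.metadata["Name"], dist.version)
        for dist in importlib.metadata.distributions()
        if dist.metadata["Name"] is not None
    )

# Print the first few entries of this environment's BOM.
for name, version in software_bom()[:5]:
    print(f"{name}=={version}")
```

Combining an inventory like this with the advisory matching described in the security literature is what turns a BOM from paperwork into a practical defense.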

BSIMM has grown from nine participating companies in 2008 to 128 in 2021, with now nearly 3,000 software security group members and over 6,000 satellite members (aka security champions).

This 2021 edition of the BSIMM report BSIMM12 examines anonymized data from the software security activities of 128 organizations across various verticals, including financial services, FinTech, independent software vendors, IoT, healthcare, and technology organizations.

Read the full report by BSIMM.


Top 10 Artificial Intelligence Open Source Projects with Python – Analytics Insight

Artificial intelligence has been one of the advanced topics in the tech industry. The implementation of AI applications is growing rapidly, and tech enthusiasts have to keep up with this evolving field to work with AI-driven tools and applications. One of the most popular programming languages used in AI and ML projects is Python. This article provides a list of open-source AI projects and applications built with Python.

TensorFlow: It ranks as one of the top open-source AI projects with Python. TensorFlow is a product of Google and helps developers create and train ML models. It has helped ML engineers convert prototypes into working products quickly and efficiently. Currently, it has thousands of users worldwide and is a go-to solution for AI.

Chainer: Chainer is a Python-based framework to work on neural networks. It supports multiple network architectures simultaneously, including recurrent nets, recursive nets, and feed-forward nets. Also, it allows CUDA computation so that the users can use GPU with very few lines of code.

PyTorch: PyTorch helps in research prototyping so that the users can deploy the products faster. It permits transmission between graph modes through TorchScript and provides distributed training that the users can scale. This model is available on multiple cloud platforms and has numerous tools in its ecosystem to support NLP, computer vision, and other solutions.

Shogun: It is a machine learning library and assists in creating efficient ML models. Shogun is not based on Python exclusively as it can be used with several other programming languages like C#, Lua, Ruby, and R, to name a few. It allows combining several algorithm classes and data presentations so that users can prototype data pipelines quickly.

Gensim: It is an open-source Python library that can analyze plain text files for a deeper understanding of the semantic structures, and also retrieve semantically similar files, and perform such other tasks. Like any other Python library, it is scalable and platform-independent.

Statsmodels: It is a Python module that provides classes and functions for the estimation of different statistical models, for conducting tests, and for statistical data exploration. It supports specifying models using R-style formulas and data frames.

Theano: Theano allows users to efficiently evaluate mathematical operations, including those on multi-dimensional arrays. It is used in building deep learning projects. Theano's high speeds give tough competition to C implementations for problems involving large amounts of data. It is programmed to take structures and convert them into efficient code.

Keras: Keras is an accessible API for neural networks. It is based on Python and can also run on CNTK, TensorFlow, and Theano. It is written in Python and follows best practices to reduce cognitive load. It makes working on deep learning projects more efficient.

NuPIC: It is an open-source project based on the theory of HTM (Hierarchical Temporal Memory). The project draws on deep theoretical neuroscience research that has led to discoveries about how the brain works, and its learning systems have demonstrated impressive achievements.

Scikit-learn: It is a Python-based library of tools and applications that can be used for data mining and data analysis. This tool has excellent accessibility and is extremely easy to use. The developers have built it on NumPy and SciPy to facilitate efficiency for beginners and intermediates.
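As a hint of why scikit-learn is considered beginner-friendly, the entire fit-and-score workflow fits in a few lines. A minimal sketch, assuming scikit-learn and NumPy are installed:

```python
# Train a classifier on a tiny, linearly separable toy dataset and score it.
from sklearn.linear_model import LogisticRegression
import numpy as np

X = np.array([[0.0], [0.2], [0.8], [1.0]])  # one feature per sample
y = np.array([0, 0, 1, 1])                  # two classes

model = LogisticRegression().fit(X, y)
accuracy = model.score(X, y)  # fraction of training samples classified correctly
print(accuracy)
```

The same `fit`/`predict`/`score` interface applies across scikit-learn's estimators, which is much of what makes the library approachable for beginners.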
