Twitter Becomes a Tool of Government Censorship – The Wall Street Journal

By Vivek Ramaswamy and Jed Rubenfeld

Alex Berenson is back on Twitter after being banned for nearly a year over Covid-19 misinformation. Last week the former New York Times reporter settled his lawsuit against the social-media company, which admitted error and restored his account. The First Amendment does not apply to private companies like Twitter, Mr. Berenson wrote last week on Substack. But because the Biden administration brought pressure to bear on Twitter, he believes he has a case that his constitutional rights were violated. He's right.

In January 2021 we argued on these pages that tech companies should be treated as state actors under existing legal doctrines when they censor constitutionally protected speech in response to governmental threats and inducements. The Biden administration appears to have taken our warning calls as a how-to guide for effectuating political censorship through the private sector. And it's worse than we feared.

More:

Twitter Becomes a Tool of Government Censorship - The Wall Street Journal

More users want encryption, but the transition can be complicated for messaging apps – Marketplace

End-to-end encryption is a way to keep messages private. It's sometimes used by apps, which basically turn those messages into unintelligible chunks of data as soon as a user hits "Send."

The idea is that no one except sender and recipient can access that message. Not hackers, not third parties, not even the app platform itself. And you have to have special keys stored on an individual device to decrypt it.
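To make that concrete, here is a minimal sketch of the idea in Python using the PyNaCl library. This is an illustration only; real messaging apps layer key ratcheting, authentication, and multi-device handling on top of primitives like these, and this is not Messenger's or Signal's actual code.

# Minimal end-to-end encryption sketch with PyNaCl (pip install pynacl).
# Illustrative only: real messengers add ratcheting, authentication,
# and multi-device key distribution on top of primitives like these.
from nacl.public import PrivateKey, Box

# Each party generates a key pair; private keys never leave their devices.
alice_private = PrivateKey.generate()
bob_private = PrivateKey.generate()

# Public keys can be shared openly (for example, via the messaging platform).
alice_public = alice_private.public_key
bob_public = bob_private.public_key

# Alice encrypts for Bob using her private key and Bob's public key.
sender_box = Box(alice_private, bob_public)
ciphertext = sender_box.encrypt(b"see you at 7?")

# The platform only ever relays `ciphertext`. Bob decrypts on his own device.
receiver_box = Box(bob_private, alice_public)
print(receiver_box.decrypt(ciphertext))  # b'see you at 7?'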

But many messaging platforms don't have this kind of encryption, and some provide it only as an option.

Kimberly Adams of Marketplace Tech spoke with Matthew Green, a professor at the Johns Hopkins Information Security Institute, about why more apps don't have end-to-end encryption by default. The following is an edited transcript of their conversation.

Matthew Green: One of the problems is that services like Facebook Messenger, they're designed to work across multiple different devices, right? And getting all of that to work with encryption is hard because it means you have to have encryption keys delivered to all those different phones. That's challenging. And then law enforcement and platform abuse teams, they're worried that people will break laws, send abusive pictures and so on. And end-to-end encryption is very nerve-wracking for those interests because they can't see the images.

Kimberly Adams: From a design standpoint, does it matter when you add encryption to a messaging service?

Green: Yes, it makes much more sense to add encryption from the beginning. If you design a new messaging service and it has encryption right from the start, like Signal, for example, then it's really easy to deploy that. Each time you add a new feature, you can ask, "How does this fit into the encryption? How do I do things?" In the other direction, when you're basically going backwards to a very popular service that already does not use encryption, adding encryption can be challenging because you have to think about all these features you support, like multiple devices, working on arbitrary web browsers, bots, things like that. Each of those services has to be adapted to use encryption. And that's why, for Facebook Messenger in particular (Facebook is now deploying encryption across all of its existing services), it's taking them a long time to figure out all those details.

Adams: How does money factor into the encryption debate? Because, I mean, these messaging services could potentially provide a lot of useful user data that could be monetized or used to create targeted ads. And I imagine if you have really good end-to-end encryption, that ability to monetize that content theoretically goes away, right?

Green: My impression is that a lot of these advertising-supported networks like Google and Facebook, they have more user data about you than they know what to do with. So for them, there's actually kind of a balance where, hey, yes, we could have access to all your private conversations and thoughts. But we already have so much data, do we really want to be the people who are mining your private conversations to get that? And that's why I think so many of these providers, particularly Facebook, are moving to encryption, is they just don't need that private conversation data. They already have enough.

Adams: What do you see as the demand moving forward by users, at least, for encrypted messaging apps and services?

Green: Well, one of the things that's been amazing to me is over the last year or so I use this app called Signal, which is a great thing. And I get notifications saying, "So-and-so is on Signal." And it used to be that so-and-so was some computer scientist or technical person I work with. And nowadays, so-and-so is my neighbor who I don't even think knows how to use a computer. The impression I get is that people genuinely feel that private messages should be private. And so I think that now they know that the older systems aren't very private, they're happy to switch to these newer technologies that don't cause them any controversy or any pain.

Adams: As somebody who studies this all the time, how have you noticed, sort of, the public perception and knowledge around issues of encryption change?

Green: Encryption used to be one of those science fiction things. You'd see it on TV, you know, Star Trek, or you'd see it on cop shows occasionally. But it was always a criminal using encryption. I think that what's really changed is that encryption has gone from this thing that was mostly used by mobsters or the bad guys on TV to something that everyone just kind of takes for granted. And we understand why, right? Because we're all carrying our entire lives around with us, all our private conversations on this little computer in our pocket. And we really, really are sensitive to the fact, even if not consciously, we're sensitive to the fact that all of our private information could go so easily. And I think nowadays, the people who think about this stuff, they think about encryption as basically the only antidote against, you know, losing everything that you care about. And so encryption has gone from being kind of an exotic, dirty word, to just being a technology that is there and protects us.

You may have heard last week that Meta is testing new encryption features in its Messenger app. The company has said it would take years to add more secure encryption to Messenger by default.

Meta made the announcement after it complied with court orders and released chat histories between a Nebraska woman and her teenage daughter. The messages are allegedly about the daughter seeking abortion services more than 20 weeks into her pregnancy, which is illegal in that state.

Meta has said its decision to roll out additional encryption features in Messenger is not related to that court order.

If you want to know how to test that new end-to-end encryption feature on your Messenger app, The Verge has a handy summary.

But if you're in the market to try an app that's already encrypted, PCMag published its take on the best, most secure messaging apps of 2022.

They're in no specific order: WhatsApp, Telegram and a favorite here in Washington, D.C., Signal.

Originally posted here:
More users want encryption, but the transition can be complicated for messaging apps - Marketplace

How to enable end-to-end encryption on Facebook Messenger – The Verge

While protecting your privacy online has been a subject of interest for a while now, events in the news (for example, the chat history Facebook recently turned over to police) have brought it front and center. But how do you protect your privacy while staying in touch with friends and relatives? While there are a number of messaging apps that boast increased privacy features, sometimes you can't persuade the people you want to keep in touch with to use them. What is your alternative? What, for example, if they insist on chatting with Facebook Messenger?

Well, you can start by using end-to-end encryption (E2EE) on Messenger.

Basically, end-to-end encryption means that nobody, not even Facebook's parent company Meta, should be able to read what is in your chat. In short, this is accomplished by each party's account being assigned a special key; only the account with that key can unlock the message. Currently, Meta has E2EE available on its Messenger platform, but only on a per-chat basis. The company has announced its intention to turn on E2EE by default soon, but in the meantime, if you're about to embark on a Messenger conversation that you want to keep private, here's how to turn it on. (The process is generally the same for both Android devices and iPhones.)

If youre already chatting with the person and decide you want to enable E2EE, you can do that as well.

From that information page, you can also go into Vanish mode, which will cause the conversation to vanish when you close the chat.

You can also decide at what time a message will vanish anywhere from five seconds to a day. This is called a disappearing (rather than a vanishing) message. To create one:

One thing to be aware of is that an encrypted conversation can only be between the people in that conversation and the devices they are using. If you start an encrypted conversation on one mobile device, you cant just move to another device and continue it; you have to sign in to the Messenger app on the other device and manually add it to the conversation. (The other participants will be notified that there was a new device added.)

In addition, you can take part in encrypted chats on the web using the Messenger app on Chrome, Safari, and Firefox. (In Firefox, ironically, private mode must be disabled.)

Update August 17th, 2022, 11:10AM ET: This article was originally published on August 16th, 2022, and has been updated to add information about disappearing messages.

Read the original post:
How to enable end-to-end encryption on Facebook Messenger - The Verge

Bellabeat is First Period and Pregnancy Tracking App and Wearable to Implement Private Key Encryption (AES-256) Security Feature to Protect Women's…

Popular Period and Pregnancy Tracking Wearable for Women Is One of the Safest in Terms of Cybersecurity and Data Protection

SILICON VALLEY, Calif., Aug. 18, 2022 /PRNewswire/ -- In early July, Bellabeat was the first pregnancy tracker to roll out a new layer of data security to protect its all-female base of end users' data in the wake of the United States Supreme Court overturning Roe v. Wade. Like many mobile apps, the company had been using full end-to-end encryption of the Bellabeat mobile app for users of all of its wearable products, end-to-end encryption being the common and secure way to protect customers' data. The company determined that to protect their health data, it was necessary to take data security a step further without delay. As of August 17th, 2022, eighteen out of 25 reproductive health apps and wearable devices that Mozilla investigated for privacy and security practices received a *Privacy Not Included warning label. Bellabeat did not receive a warning label, as the company has been exceptionally public in immediately taking the following steps after the newest Roe v. Wade ruling.

The newly implemented Private Key Encryption (AES-256) feature will enable every Bellabeat user to access and decrypt her data using a private key via her Bellabeat smartphone app. Any data stored on the Bellabeat servers will be in encrypted form only. Thus, no one who accesses the Bellabeat servers (lawfully or unlawfully) can read that data, adding an extra layer of security in which data stored on the company's servers cannot be read without holding an individual user's private key. The only person who can access the confidential health data and info in its decrypted form will be the Bellabeat customer herself. The private key is a password or a PIN code that only the user herself knows or stores on her private device. Without that key, her data is unreadable. Ideally, implementing the new security feature gives full control and ownership of data to Bellabeat's end users. The company will therefore not be able to benefit from collecting end-user data in any shape or form, including for internal research or product improvements. Bellabeat executives determined that there was no question about the options and that users' safety at this time is of the utmost importance. The feature is currently in testing and will be rolled out within all Bellabeat products having women's reproductive health tracking features (period and pregnancy data tracking) by end of July.
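Bellabeat has not published implementation details, but the pattern the release describes, data encrypted at rest under a key derived from a secret only the user holds, can be sketched roughly as follows. This is a hypothetical illustration using Python's cryptography package; the function names, parameters, and record format are assumptions, not Bellabeat's code.

# Hypothetical sketch of client-side encryption under a user-held secret.
# Not Bellabeat's implementation; parameter choices are illustrative.
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def derive_key(passphrase: bytes, salt: bytes) -> bytes:
    # Derive a 256-bit AES key from the user's passphrase or PIN.
    kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32, salt=salt, iterations=600_000)
    return kdf.derive(passphrase)

# On the user's device: encrypt health data before it is uploaded.
salt, nonce = os.urandom(16), os.urandom(12)
key = derive_key(b"user-only-pin-1234", salt)
record = b'{"cycle_day": 14, "temperature": 36.7}'
ciphertext = AESGCM(key).encrypt(nonce, record, None)

# The server stores only (salt, nonce, ciphertext): unreadable without the key.
# Back on the device, the same passphrase recovers the data.
plaintext = AESGCM(derive_key(b"user-only-pin-1234", salt)).decrypt(nonce, ciphertext, None)
assert plaintext == record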


The decision to add this exceptional layer of data security comes in the wake of the U.S. Supreme Court's June 24th, 2022, ruling to overturn the landmark case Roe v. Wade, in which the Court had ruled that the Constitution of the United States generally protects a pregnant woman's liberty to choose to have an abortion. The overturning of Roe v. Wade now gives states a license to ban abortion. Thirteen U.S. states, mainly within the South and Midwest, had trigger bans set to activate upon the Supreme Court decision, and these are now starting to take effect, some immediately upon the ruling being released. As Bellabeat is a women's health tracker with a specific focus on menstrual, reproductive, and fertility tracking, end-to-end encryption was determined to be of the utmost importance to protect the fast-growing company's customers.

"Our business is helping women to track and understand their cycles and bodies. The Overturning of Roe Vs. Wade is a tremendous blow to women's rights. It is an incredibly sad and terrifying day for Women's health and Women's rights. Many women are now in fear of exactly what to share and where to share it. This ruling will change how health data and records are maintained offline with OBGYNs and primary care physicians, what women feel safe to disclose, and will grossly change how women will choose to share their reproductive information online. We will continue to be a safe and progressive space for women to track their cycles, fertility, and all wellness concerns," states Urska Sren Co-Founder of Bellabeat. "Incorporating the Private Key encryption feature means an extra layer of security designed to ensure our users' safety. This also means our end users can be sure that we are unable to leak or sell their data and that a breach or break within Bellabeat's servers will never mean a threat to their personal safety."

In a recent Wall Street Journal article, legal experts are quoted as saying that in a scenario where Roe is overturned, your digital breadcrumbs, including the kind that come from period trackers, could be used against you in states where laws criminalize aiding in or undergoing abortion.

"It is a horrific idea that your health data and digital breadcrumbs could be used against you to criminalize women making life-changing reproductive choices. It's not a sentiment reflected anywhere in healthcare or health rights for the male body. We stand with women everywhere and have taken the necessary steps End-to-end. We also do not sell or share our customer info," states Sandro Mur, CoFounder of Bellabeat. "The implementation of the Private Key Encryption ensures that we will never be placed in a position, as a company, where we could be forced to submit user's private health data in its readable form."

Bellabeat is a leader in creating wellness technology whose products include wearables specifically made for women that track health, wellness, and reproductive info via the Bellabeat Ivy, Leaf Urban, and Leaf Chakra. Bellabeat is aimed exclusively at women and recently announced that it has started the process of submitting an official application to the FDA for its product, the Bellabeat Ivy. Obtaining a license from the U.S. Food and Drug Administration (FDA) would allow doctors and clinicians to officially use the Ivy wearable technology to monitor the menstrual cycle in the treatment of women. In recent coverage, the Ivy has been seen as an outstanding health tracker to monitor and track a woman's menstrual cycle, fertility, postpartum depression symptoms, menopause symptoms, and more.

For media inquiries on the Bellabeat mobile app or additional quotes or interviews surrounding Bellabeat data protection upon the overturning of Roe v. Wade, please email mtatum@bpm-prfirm.com or call 877.841.7244.

About Bellabeat

Bellabeat Inc. is a Silicon Valley company building tech-powered wellness products for women. The Bellabeat team previously released the Bellabeat Ivy and disruptive Leaf health tracking jewelry for women and the first smart water bottle powered by A.I. Bellabeat is now revolutionizing the FemTech space by taking natural cycles into account when creating its guided programs and Ivy Smart Bracelet, helping women reach their health goals more effectively and enjoyably. Visit https://bellabeat.com/ for additional information.

Media Contact: Monique Tatum, 877.841.7244, 342922@email4pr.com


View original content to download multimedia: https://www.prnewswire.com/news-releases/bellabeat-is-first-period-and-pregnancy-tracking-app-and-wearable-to-implement-private-key-encryption-aes-256-security-feature-to-protect-womens-data-in-the-wake-of-roe-vs-wade-overturn-301608919.html

SOURCE Bellabeat

Read the original post:
Bellabeat is First Period and Pregnancy Tracking App and Wearable to Implement Private Key Encryption (AES-256) Security Feature to Protect Women's...

Best Machine Learning Books to Read This Year [2022 List] – CIO Insight


Machine learning (ML) books are a valuable resource for IT professionals looking to expand their ML skills or pursue a career in machine learning. In turn, this expertise helps organizations automate and optimize their processes and make data-driven decisions. Machine learning books can help ML engineers learn a new skill or brush up on old ones.

Beginners and seasoned experts alike can benefit from adding machine learning books to their reading lists, though the right book depends on the learner's goals. Some books serve as an entry point to the world of machine learning, while others build on existing knowledge.

The books in this list are roughly ranked in order of difficulty; beginners should avoid pursuing the books toward the end until they've mastered the concepts introduced in the books at the top of the list.

Machine Learning for Absolute Beginners is an excellent introduction to the machine learning field of study. It's a clear and concise overview of the high-level concepts that drive machine learning, so it's ideal for beginners. The e-book format has free downloadable resources, code exercises, and video tutorials to satisfy a variety of learning styles.

Readers will learn the basic ML libraries and other tools needed to build their first model. In addition, this book covers data scrubbing techniques, data preparation, regression analysis, clustering, and bias/variance. This book may be a bit too basic for readers who are interested in learning more about coding, deep learning, or other advanced skills.

As the name implies, The Hundred-Page Machine Learning Book provides a brief overview of machine learning and the mathematics involved. It's suitable for beginners, but some knowledge of probability, statistics, and applied mathematics will help readers get through the material faster.

The book covers a broad range of ML topics at a high level and focuses on the aspects of ML that are of significant practical value. These include:

Several reviewers said that the text explains complicated topics in a way that is easy for most readers to understand. It doesn't dive into any one topic too deeply, but it provides several practice exercises and links to other resources for further reading.

Introduction to Machine Learning with Python is a starting point for aspiring data scientists who want to learn about machine learning through Python frameworks. It doesn't require any prior knowledge of machine learning or Python, though familiarity with the NumPy and matplotlib libraries will enhance the learning experience.

In this book, readers will gain a foundational understanding of machine learning concepts and the benefits and drawbacks of using standard ML algorithms. It also explains how all of the algorithms behind various Python libraries fit together in a way that's easy to understand for even the most novice learners.

Python Machine Learning by Example builds on existing machine learning knowledge for engineers who want to dive deeper into Python programming. Each chapter demonstrates the practical application of common Python ML skills through concrete examples. These skills include:

This book walks through each problem with a step-by-step guide for implementing the right Python technique. Readers should have prior knowledge of both machine learning and Python, and some reviewers recommended supplementing this guide with more theoretical reference materials for advanced comprehension.

Hands-on Machine Learning with Scikit-Learn, Keras & TensorFlow provides a practical introduction to machine learning with a focus on three Python frameworks. Readers will gain an understanding of numerous machine learning concepts and techniques, including linear regression, neural networks, and deep learning. Then, readers can apply what they learn to practical exercises throughout the book.

Though this book is marketed toward beginners, some reviewers said it requires a basic understanding of machine learning principles. With this in mind, it may be better suited for readers who want to refresh their existing knowledge through concrete examples.

Machine Learning for Hackers is written for experienced programmers who want to maximize the impact of their data. The text builds on existing knowledge of the R programming language to create basic machine learning algorithms and analyze datasets.

Each chapter walks through a different machine learning challenge to illustrate various concepts. These include:

This book is best suited for intermediate learners who are fluent in R and want to learn more about the practical applications of machine learning code. Students looking to delve into machine learning theory should opt for a more advanced book like Deep Learning, Hands-on Machine Learning, or Mathematics for Machine Learning.

Pattern Recognition and Machine Learning is an excellent reference for understanding statistical methods in machine learning. It provides practical exercises to introduce the reader to comprehensive pattern recognition techniques.

The text is broken into chapters that cover the following concepts:

Readers should have a thorough understanding of linear algebra and multivariable calculus, so it may be too advanced for beginners. Familiarity with basic probability theory, decision theory, and information theory will make the material easier to understand as well.

Mathematics for Machine Learning teaches the fundamental mathematical concepts necessary for machine learning. These topics include:

Some reviewers said this book leans more into mathematical theorems than practical application, so it's not recommended for those without prior experience in applied mathematics. However, it's one of the few resources that bridge the gap between mathematics and machine learning, so it's a worthwhile investment for intermediate learners.

For advanced learners, Deep Learning covers the mathematics and concepts that power deep learning, a subset of machine learning that makes human-like decisions. This book walks through deep learning computations, techniques, and research including:

There are about 30 pages that cover practical applications of deep learning like computer vision and natural language processing, but the majority of the book deals with the theory behind deep learning. With this in mind, readers should have a working knowledge of machine learning concepts before delving into this text.

Read next: Ultimate Machine Learning Certification Guide

Read the original here:
Best Machine Learning Books to Read This Year [2022 List] - CIO Insight

AI and Machine Learning in Finance: How Bots are Helping the Industry – ReadWrite

Artificial intelligence and ML are making considerable inroads in finance. They are a critical part of various financial applications, including evaluating risks, managing assets, calculating credit scores, and approving loans.

Businesses use AI and ML in a variety of ways.

Taking the above points into account, it's no wonder that companies like Forbes and VentureBeat are using AI to predict cash flow and detect fraud.

In this article, we present the financial domain areas in which AI and ML have a more significant impact. We'll also discuss why financial companies should care about and implement these technologies.

Machine learning is a branch of artificial intelligence that allows systems to learn and improve without explicit programming. Simply put, data scientists train the ML model with existing data sets, and the model automatically adjusts its parameters to improve the outcome.

According to Statista, digital payments are expected to show an annual growth rate of 12.77% and grow to 20% by 2026. This vast volume of global revenue transacted online requires an intelligent fraud-detection system.

Source: Mordor Intelligence

Traditionally, to check the authenticity of users, fraud-detection systems analyze transactions through factors like location, merchant ID, the amount spent, etc. However, while this method is appropriate for a small number of transactions, it cannot cope with the increased transaction volume.

And, given the surge of digital payments, businesses can't rely on traditional fraud-detection methods to process payments. This gives rise to AI-based systems with advanced features.

An AI and ML-powered payment gateway will look at various factors to evaluate the risk score. These technologies consider a large volume of data (location of the merchant, time zone, IP address, etc.) to detect unexpected anomalies, and verify the authenticity of the customer.

Additionally, the finance industry, through AI, can process transactions in real-time, allowing the payment industry to process large transactions with high accuracy and low error rates.
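As a rough sketch of what such risk scoring can look like, the snippet below trains an unsupervised anomaly detector on a customer's transaction history and scores new payments. The features and data are synthetic assumptions for illustration, not any gateway's actual model.

# Illustrative anomaly-based risk scoring for card transactions.
# Features and data are synthetic; real systems use far richer signals.
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)

# Historical "normal" transactions: [amount_usd, hour_of_day, km_from_home]
history = np.column_stack([
    rng.normal(40, 15, 500),    # typical purchase amounts
    rng.normal(14, 3, 500),     # mostly daytime
    rng.normal(5, 2, 500),      # close to home
])

model = IsolationForest(contamination=0.01, random_state=0).fit(history)

new_transactions = np.array([
    [35.0, 13.0, 4.0],     # looks routine
    [950.0, 3.0, 4800.0],  # large, late-night, far from home
])

# Lower decision_function scores mean more anomalous, so negate for a risk score.
risk_scores = -model.decision_function(new_transactions)
flags = model.predict(new_transactions)  # -1 = anomaly, 1 = normal
print(list(zip(risk_scores.round(3), flags)))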

The financial sector, including banks, trading firms, and other fintech companies, is using AI to reduce operational costs, improve productivity, enhance the user experience, and improve security.

The benefits of AI and ML revolve around their ability to work with various datasets. So let's have a quick look at some other ways AI and ML are making inroads into this industry:

Considering how people invest their money in automation, AI significantly impacts the payment landscape. It improves efficiency and helps businesses rethink and reconstruct their processes. For example, businesses can use AI to decrease credit card processing time, increase automation, and seamlessly improve cash flow.

With AI and machine learning, you can make predictions across credit, lending, security, trading, banking, and process optimization.

Human error has always been a huge problem; with machine learning models, you can reduce the errors that arise when humans perform repetitive tasks.

Incorporating security and ease of use is a challenge that AI can help the payment industry overcome. Merchants and clients want a payment system that is easy to use and authentic.

Until now, customers have had to perform various actions to authenticate themselves to complete a transaction. With AI, however, payment providers can smooth transactions while keeping customers' risk low.

AI can efficiently perform high-volume, labor-intensive tasks like quickly scraping and formatting data. Also, AI-based businesses are focused and efficient; they have minimal operational costs and can be used in areas like:

Creating More Value:

AI and machine learning models can generate more value for their customers. For instance:

Improved customer experience: Using bots, financial sectors like banks can eliminate the need to stand in long queues. Payment gateways can automatically reach new customers by gathering their historical data and predicting user behavior. Besides, AI used in credit scoring helps detect fraudulent activity.

There are various ways in which machine learning and artificial intelligence are being employed in the finance industry. Some of them are:

Process Automation:

Process automation is one of the most common applications as the technology helps automate manual and repetitive work, thereby increasing productivity.

Moreover, AI and ML can easily access data, follow and recognize patterns, and interpret the behavior of customers. This could be used for customer support systems.

Minimizing Debit and Credit Card Fraud:

Machine learning algorithms help detect transactional fraud by analyzing various data points that mostly go unnoticed by humans. ML also reduces the number of false rejections and improves real-time approvals by gauging the client's behavior on the Internet.

Apart from spotting fraudulent activity, AI-powered technology is used to identify suspicious account behavior in real time. Today, banks already have monitoring systems trained on historical payment data.

Reducing False Card Declines:

Payment transactions declined at checkout can be frustrating for customers and carry huge repercussions for banks and their reputations. Card transactions are declined when the transaction is flagged as fraud or the payment amount crosses the limit. AI-based systems are used to identify transaction issues and reduce these false declines.
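One simplified illustration of how this can work is to train a probabilistic fraud classifier and then tune its decision threshold so that fraud recall stays high while fewer legitimate payments are declined. The data and the recall target below are synthetic assumptions, not a production decisioning system.

# Illustrative threshold tuning to reduce false card declines.
# Synthetic data; a real system would use rich transaction features.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_recall_curve

X, y = make_classification(n_samples=20_000, weights=[0.98, 0.02], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
fraud_prob = clf.predict_proba(X_te)[:, 1]

# Instead of a blunt 0.5 cutoff, pick the highest threshold that still
# catches about 90% of fraud, so fewer legitimate payments are declined.
precision, recall, thresholds = precision_recall_curve(y_te, fraud_prob)
ok = recall[:-1] >= 0.90                  # recall[:-1] aligns with thresholds
chosen = thresholds[ok][-1] if ok.any() else 0.5

declined = fraud_prob >= chosen
false_decline_rate = np.mean(declined[y_te == 0])
print(f"threshold={chosen:.3f}, share of legit payments declined={false_decline_rate:.4f}")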

The influx of AI in the financial sector has raised new concerns about its transparency and data security. Companies must be aware of these challenges and put safeguards in place:

One of the main challenges of AI in finance is that much of the data gathered is confidential and sensitive. The right data partner will offer various security options, adhere to standards, and protect data with appropriate certifications and regulatory compliance.

Creating AI models in finance that provide accurate predictions is only successful if they can be explained to and understood by clients. In addition, since customers' information is used to develop such models, they want to ensure that their personal information is collected, stored, and handled securely.

So, it is essential to maintain transparency and trust in the finance industry to make customers feel safe with their transactions.

Apart from simply implementing AI in the online finance industry, industry leaders must be able to adapt to new working models and new operations.

Financial institutions often work with substantial unorganized data sets in vertical silos. Connecting dozens of data pipeline components and tons of APIs on top of security to leverage a silo is not easy, so financial institutions need to ensure that their gathered data is appropriately structured.

AI and ML are undoubtedly the future of the financial sector; the vast volume of processes, transactions, data, and interactions involved makes them ideal for various applications. By incorporating AI, the finance sector will get vast data-processing capabilities at the best prices, while clients will enjoy an enhanced customer experience and improved security.

Of course, how much of AI's power can be realized within transaction banking depends on how each organization uses it. Today, AI is very much a work in progress, but its challenges can be overcome as the technology matures. Lastly, AI will be the future of finance, and you must be ready to embrace its revolution.

Featured Image Credit: Photo by Anna Nekrashevich; Pexels; Thank you!

Read the original:
AI and Machine Learning in Finance: How Bots are Helping the Industry - ReadWrite

PhD Position – Machine learning to increase geothermal energy efficiency, Karlsruhe Institute – ThinkGeoEnergy

The Karlsruhe Institute of Technology in Germany has an open PhD position for a project that will use machine learning to model scaling formation in cascade geothermal operations.

The Karlsruhe Institute of Technology (KIT) in Germany currently has an open PhD position in the upcoming Machine Learning for Enhancing Geothermal energy production (MALEG) project. Interested applicants may visit the official KIT page for more details on the application. Submissions will be accepted only until September 30, 2022.

The target of the MALEG project is the design and optimization of cascade production schemes aiming for the highest possible energy output in geothermal energy facilities by preventing scaling. The enhanced scaling potential of lower return temperatures is one key challenge as geothermal cascade use becomes a more common strategy to increase efficiency.

The research will be focusing on the development of a machine learning tool to quantify the impact of the enhanced cooling on the fluid-mineral equilibrium and to optimize the operations economically. The tool will be based on results from widely-applied deterministic models and experimental data collected at geothermal plants in Germany, Austria and Turkey by our international project partners. Once fully implemented the MALEG-tool will work as a digital twin of the power plant, ready to assess and predict scaling formation processes for geothermal production from different geological settings.

The ideal candidate should hold a masters degree in geosciences or geophysics with sound interest in aqueous geochemistry and experience in numerical modeling.

Source: Karlsruhe Institute of Technology

Read the original:
PhD Position - Machine learning to increase geothermal energy efficiency, Karlsruhe Institute - ThinkGeoEnergy

Are You Making These Deadly Mistakes With Your AI Projects? – Forbes

Since data is at the heart of AI, it should come as no surprise that AI and ML systems need enough good quality data to learn. In general, a large volume of good quality data is needed, especially for supervised learning approaches, in order to properly train the AI or ML system. The exact amount of data needed may vary depending on which pattern of AI you're implementing, the algorithm you're using, and other factors such as in-house versus third-party data. For example, neural nets need a lot of data to be trained, while decision trees or Bayesian classifiers don't need as much data to still produce high quality results.

So you might think more is better, right? Well, think again. Organizations with lots of data, even exabytes, are realizing that having more data is not the solution to their problems as they might expect. Indeed, more data, more problems. The more data you have, the more data you need to clean and prepare. The more data you need to label and manage. The more data you need to secure, protect, mitigate bias, and more. Small projects can rapidly turn into very large projects when you start multiplying the amount of data. In fact, many times, lots of data kills projects.

Clearly the missing step between identifying a business problem and getting the data squared away to solve that problem is determining which data you need and how much of it you really need. You need enough, but not too much. Goldilocks data is what people often say: not too much, not too little, but just right. Unfortunately, far too often, organizations are jumping into AI projects without first addressing an understanding of their data. Questions organizations need to answer include figuring out where the data is, how much of it they already have, what condition it is in, what features of that data are most important, use of internal or external data, data access challenges, requirements to augment existing data, and other crucial factors and questions. Without these questions answered, AI projects can quickly die, even drowning in a sea of data.

Getting a better understanding of data

In order to understand just how much data you need, you first need to understand how and where data fits into the structure of AI projects. One visual way of understanding the increasing levels of value we get from data is the DIKUW pyramid (sometimes also referred to as the DIKW pyramid) which shows how a foundation of data helps build greater value with Information, Knowledge, Understanding and Wisdom.

DIKW pyramid

With a solid foundation of data, you can gain additional insights at the next information layer, which helps you answer basic questions about that data. Once you have made basic connections between data to gain informational insight, you can find patterns in that information to gain an understanding of how various pieces of information are connected together for greater insight. Building on a knowledge layer, organizations can get even more value from understanding why those patterns are happening, providing an understanding of the underlying patterns. Finally, the wisdom layer is where you can gain the most value from information by providing insights into the cause and effect of information decision making.

This latest wave of AI focuses most on the knowledge layer, since machine learning provides the insight on top of the information layer to identify patterns. Unfortunately, machine learning reaches its limits in the understanding layer, since finding patterns isn't sufficient for reasoning. We have machine learning, but not the machine reasoning required to understand why the patterns are happening. You can see this limitation in effect any time you interact with a chatbot. While machine learning-enabled NLP is really good at understanding your speech and deriving intent, it runs into limitations trying to understand and reason. For example, if you ask a voice assistant whether you should wear a raincoat tomorrow, it doesn't understand that you're asking about the weather. A human has to provide that insight to the machine because the voice assistant doesn't know what rain actually is.

Avoiding Failure by Staying Data Aware

Big data has taught us how to deal with large quantities of data. Not just how it's stored but how to process, manipulate, and analyze all that data. Machine learning has added more value by being able to deal with the wide range of different types of unstructured, semi-structured or structured data collected by organizations. Indeed, this latest wave of AI is really the big data-powered analytics wave.

But it's exactly for this reason that some organizations are failing so hard at AI. Rather than run AI projects with a data-centric perspective, they are focusing on the functional aspects. To get a handle on their AI projects and avoid deadly mistakes, organizations need a better understanding not only of AI and machine learning but also of the Vs of big data. It's not just about how much data you have, but also the nature of that data. Some of those Vs of big data include:

With decades of experience managing big data projects, organizations that are successful with AI are primarily successful with big data. The ones that are seeing their AI projects die are the ones who are coming at their AI problems with application development mindsets.

Too Much of the Wrong Data, and Not Enough of the Right Data is Killing AI Projects

While AI projects may start off on the right foot, the lack of the necessary data and the lack of understanding of, and then solving, real problems are killing AI projects. Organizations are powering forward without actually having a real understanding of the data that they need and the quality of that data. This poses real challenges.

One of the reasons why organizations are making this data mistake is that they are running their AI projects without any real approach to doing so, other than using Agile or app dev methods. However, successful organizations have realized that data-centric approaches make data understanding one of the first phases of the project. The CRISP-DM methodology, which has been around for over two decades, specifies data understanding as the very next thing to do once you determine your business needs. Building on CRISP-DM and adding Agile methods, the Cognitive Project Management for AI (CPMAI) methodology requires data understanding in its Phase II. Other successful approaches likewise require data understanding early in the project because, after all, AI projects are data projects. And how can you build a successful project on a foundation of data without running your projects with an understanding of data? That's surely a deadly mistake you want to avoid.

See the original post here:
Are You Making These Deadly Mistakes With Your AI Projects? - Forbes

Prediction of mortality risk of health checkup participants using machine learning-based models: the J-SHC study | Scientific Reports – Nature.com

Participants

This study was conducted as part of the ongoing Study on the Design of a Comprehensive Medical System for Chronic Kidney Disease (CKD) Based on Individual Risk Assessment by Specific Health Examination (J-SHC Study). A specific health checkup is conducted annually for all residents aged 40–74 years, covered by the National Health Insurance in Japan. In this study, a baseline survey was conducted in 685,889 people (42.7% males, age 40–74 years) who participated in specific health checkups from 2008 to 2014 in eight regions (Yamagata, Fukushima, Niigata, Ibaraki, Toyonaka, Fukuoka, Miyazaki, and Okinawa prefectures). The details of this study have been described elsewhere [11]. Of the 685,889 baseline participants, 169,910 were excluded from the study because baseline data on lifestyle information or blood tests were not available. In addition, 399,230 participants with a survival follow-up of fewer than 5 years from the baseline survey were excluded. Therefore, 116,749 patients (42.4% men) with a known 5-year survival or mortality status were included in this study.

This study was conducted in accordance with the Declaration of Helsinki guidelines. This study was approved by the Ethics Committee of Yamagata University (Approval No. 2008103). All data were anonymized before analysis; therefore, the ethics committee of Yamagata University waived the need for informed consent from study participants.

For the validation of a predictive model, the most desirable approach is a prospective study on unknown data. In this study, the data on health checkup dates were available. Therefore, we divided the total data into training and test datasets to build and test predictive models based on health checkup dates. The training dataset consisted of 85,361 participants who participated in the study in 2008. The test dataset consisted of 31,388 participants who participated in this study from 2009 to 2014. These datasets were temporally separated, and there were no overlapping participants. This method evaluates the model in a manner similar to a prospective study and has the advantage of demonstrating temporal generalizability. Clipping was performed for the 0.01% outliers during preprocessing, and normalization was performed.

Information on 38 variables was obtained during the baseline survey of the health checkups. When there were highly correlated variables (correlation coefficient greater than 0.75), only one of these variables was included in the analysis. High correlations were found between body weight, abdominal circumference, body mass index, hemoglobin A1c (HbA1c), fasting blood sugar, and AST and alanine aminotransferase (ALT) levels. We then used body weight, HbA1c level, and AST level as explanatory variables. Finally, we used the following 34 variables to build the prediction models: age, sex, height, weight, systolic blood pressure, diastolic blood pressure, urine glucose, urine protein, urine occult blood, uric acid, triglycerides, high-density lipoprotein cholesterol (HDL-C), LDL-C, AST, γ-glutamyl transpeptidase (γ-GTP), estimated glomerular filtration rate (eGFR), HbA1c, smoking, alcohol consumption, medication (for hypertension, diabetes, and dyslipidemia), history of stroke, heart disease, and renal failure, weight gain (more than 10 kg since age 20), exercise (more than 30 min per session, more than 2 days per week), walking (more than 1 h per day), walking speed, eating speed, supper within 2 h of bedtime, skipping breakfast, late-night snacks, and sleep status.

The values of each item in the training dataset for the alive/dead groups were compared using the chi-square test, Student's t-test, and Mann–Whitney U test, and significant differences (P < 0.05) were marked with an asterisk (*) (Supplementary Tables S1 and S2).

We used two machine learning-based methods (a gradient boosting decision tree [XGBoost] and a neural network) and one conventional method (logistic regression) to build the prediction models. All the models were built using Python 3.7. We used the XGBoost library for the GBDT, TensorFlow for the neural network, and Scikit-learn for logistic regression.

The data obtained in this study contained missing values. XGBoost can be trained and can predict even with missing values because of its design; however, the neural network and logistic regression cannot be trained or make predictions with missing values. Therefore, we complemented the missing values using the k-nearest neighbor method (k = 5), and the test data were complemented using an imputer trained using only the training data.
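A minimal sketch of that imputation step, with toy data standing in for the checkup variables (assumed scikit-learn usage, not the authors' code), fits the imputer on the training data only and reuses it on the test data:

# k-nearest-neighbor imputation (k = 5), fit on training data only.
import numpy as np
from sklearn.impute import KNNImputer

# Toy stand-ins for checkup variables (NaN marks a missing value).
X_train = np.array([
    [52, 120.0, 5.4],
    [67, np.nan, 6.1],
    [45, 110.0, 5.2],
    [70, 145.0, 7.0],
    [58, 130.0, np.nan],
    [63, 138.0, 6.4],
])
X_test = np.array([[60, np.nan, 5.8]])

imputer = KNNImputer(n_neighbors=5)
X_train_imputed = imputer.fit_transform(X_train)   # learn only from the training data
X_test_imputed = imputer.transform(X_test)         # apply the trained imputer to the test data
print(X_test_imputed)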

The parameters required for each model were determined for the training data using the RandomizedSearchCV class of the Scikit-learn library and repeating fivefold cross-validation 5000 times.
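The tuning step can be sketched as follows; the search space shown is an illustrative assumption rather than the study's exact grid, the data is synthetic, and the iteration count is reduced from 5000 for brevity:

# Randomized hyperparameter search with fivefold cross-validation.
from sklearn.datasets import make_classification
from sklearn.model_selection import RandomizedSearchCV
from xgboost import XGBClassifier

X_train, y_train = make_classification(n_samples=2000, n_features=34,
                                        weights=[0.97, 0.03], random_state=0)

param_distributions = {                      # illustrative ranges, not the study's exact space
    "max_depth": [3, 4, 5, 6],
    "learning_rate": [0.01, 0.05, 0.1],
    "n_estimators": [100, 300, 500],
    "subsample": [0.6, 0.8, 1.0],
}

search = RandomizedSearchCV(
    estimator=XGBClassifier(eval_metric="logloss"),
    param_distributions=param_distributions,
    n_iter=25,                               # the study reports 5000 iterations
    cv=5,                                    # fivefold cross-validation
    scoring="roc_auc",
    random_state=0,
)
search.fit(X_train, y_train)
print(search.best_params_, round(search.best_score_, 3))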

The performance of each prediction model was evaluated by predicting the test dataset, drawing a ROC curve, and using the AUC. In addition, the accuracy, precision, recall, F1 scores (the harmonic mean of precision and recall), and confusion matrix were calculated for each model. To assess the importance of explanatory variables for the predictive models, we used SHAP and obtained SHAP values that express the influence of each explanatory variable on the output of the model [4,12]. The workflow diagram of this study is shown in Fig. 5.
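A self-contained sketch of the evaluation and SHAP attribution on synthetic data (assumed library usage, not the authors' code) looks like this:

# Test-set evaluation and SHAP-based feature attribution for an XGBoost model.
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)
from xgboost import XGBClassifier

X, y = make_classification(n_samples=3000, n_features=34, weights=[0.97, 0.03], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)

model = XGBClassifier(eval_metric="logloss").fit(X_train, y_train)

y_prob = model.predict_proba(X_test)[:, 1]
y_pred = (y_prob >= 0.5).astype(int)

print("AUC:", roc_auc_score(y_test, y_prob))
print("accuracy:", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred, zero_division=0))
print("recall:", recall_score(y_test, y_pred))
print("F1:", f1_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))

# SHAP values quantify each variable's influence on the model output.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)          # per-sample, per-feature contributions
print(np.abs(shap_values).mean(axis=0)[:5])          # mean |SHAP| as a global importance summary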

Workflow diagram of development and performance evaluation of predictive models.

See original here:
Prediction of mortality risk of health checkup participants using machine learning-based models: the J-SHC study | Scientific Reports - Nature.com

Tackling the reproducibility and driving machine learning with digitisation – Scientific Computing World

Dr Birthe Nielsen discusses the role of the Methods Database in supporting life sciences research by digitising methods data across different life science functions.

Reproducibility of experiment findings and data interoperability are two of the major barriers facing life sciences R&D today. Independently verifying findings by re-creating experiments and generating the same results is fundamental to progressing research to the next stage in its lifecycle, be it advancing a drug to clinical development, or a product to market. Yet, in the field of biology alone, one study found that 70 per cent of researchers are unable to reproduce the findings of other scientists, and 60 per cent are unable to reproduce their own findings.

This causes delays to the R&D process throughout the life sciences ecosystem. For example, biopharmaceutical companies often use an external Contract Research Organisation (CROs) to conduct clinical studies. Without a centralised repository to provide consistent access, analytical methods are often shared with CROs via email or even by physical documents, and not in a standard format but using an inconsistent terminology. This leads to unnecessary variability and several versions of the same analytical protocol. This makes it very challenging for a CRO to re-establish and revalidate methods without a labour-intensive process that is open to human interpretation and thus error.

To tackle issues like this, the Pistoia Alliance launched the Methods Hub project. The project aims to overcome the issue of reproducibility by digitising methods data across different life science functions, and ensuring data is FAIR (Findable, Accessible, Interoperable, Reusable) from the point of creation. This will enable seamless and secure sharing within the R&D ecosystem, reduce experiment duplication, standardise formatting to make data machine-readable, and increase reproducibility and efficiency. Robust data management is also the building block for machine learning and is the stepping-stone to realising the benefits of AI.

Digitisation of paper-based processes increases the efficiency and quality of methods data management. But it goes beyond manually keying in method parameters on a computer or using an Electronic Lab Notebook; a digital and automated workflow increases efficiency, instrument usage and productivity. Applying shared data standards ensures consistency and interoperability, in addition to fast and secure transfer of information between stakeholders.

One area that organisations need to address to comply with FAIR principles, and a key area in which the Methods Hub project helps, is how analytical methods are shared. This includes replacing free-text data capture with a common data model and standardised ontologies. For example, in a High-Performance Liquid Chromatography (HPLC) experiment, rather than manually typing out the analytical parameters (pump flow, injection volume, column temperature, etc.), the scientist will simply download a method which will automatically populate the execution parameters in any given Chromatography Data System (CDS). This not only saves time during data entry, but the common format also eliminates room for human interpretation or error.
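To make the idea of a common, machine-readable method description concrete, here is a hypothetical sketch of what a structured HPLC method record could look like in code; the field names and units are illustrative assumptions, not the Methods Hub data model or any vendor's schema.

# Hypothetical structured representation of an HPLC method.
# Field names are illustrative, not the Methods Hub schema.
import json
from dataclasses import dataclass, asdict

@dataclass
class HPLCMethod:
    method_name: str
    pump_flow_ml_min: float        # explicit units instead of free text
    injection_volume_ul: float
    column_temperature_c: float
    detector_wavelength_nm: float

method = HPLCMethod(
    method_name="Assay-XYZ v2",
    pump_flow_ml_min=1.0,
    injection_volume_ul=10.0,
    column_temperature_c=30.0,
    detector_wavelength_nm=254.0,
)

# Serialised once, the same record can populate any CDS that understands
# the shared model, instead of being re-keyed from a PDF or email.
print(json.dumps(asdict(method), indent=2))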

Additionally, creating a centralised repository like the Methods Hub in a vendor-neutral format is a step towards greater cyber-resiliency in the industry. When information is stored locally on a PC or an ELN and is not backed up, a single cyberattack can wipe it out instantly. Creating shared spaces for these notes via the cloud protects data and ensures it can be easily restored.

A proof of concept (PoC) via the Methods Hub project was recently completed successfully to demonstrate the value of methods digitisation. The PoC involved the digital transfer, via the cloud, of analytical HPLC methods, proving it is possible to move analytical methods securely between two different companies and CDS vendors with ease. It has been successfully tested in labs at Merck and GSK, where there has been an effective transfer of HPLC-UV information between different systems. The PoC delivered a series of critical improvements to methods transfer that eliminate manual keying of data; reduce risk, steps, and error; and increase overall flexibility and interoperability.

The Alliance project team is now working to extend the platform's functionality to connect analytical methods with results data, which would be an industry first. The team will also be adding support for columns and additional hardware and other analytical techniques, such as mass spectrometry and nuclear magnetic resonance spectroscopy (NMR). It also plans to identify new use cases, and further develop the cloud platform that enables secure methods transfer.

If industry-wide data standards and approaches to data management are to be agreed on and implemented successfully, organisations must collaborate. The Alliance recognises methods data management is a big challenge for the industry, and the aim is to make Methods Hub an integral part of the system infrastructure in every analytical lab.

Tackling issues such as the digitisation of methods data doesn't just benefit individual companies but will have a knock-on effect for the whole life sciences industry. Introducing shared standards accelerates R&D, improves quality, and reduces the cost and time burden on scientists and organisations. Ultimately this ensures that new therapies and breakthroughs reach patients sooner. We are keen to welcome new contributors to the project, so we can continue discussing common barriers to successful data management and work together to develop new solutions.

Dr Birthe Nielsen is the Pistoia Alliance Methods Database project manager

View original post here:
Tackling the reproducibility and driving machine learning with digitisation - Scientific Computing World