Opinion & Analysis

5 Must-dos Before Selecting Algorithms and Tools for AI Use Cases (Special Focus on Healthcare Revenue Cycle Management)

Selecting the right algorithm and technology for an AI project isn't a one-size-fits-all solution. It requires careful consideration of various factors.

Written by: Ravi Tenneti | VP, Global Head of AI/ML, Narwal

Updated 3:53 PM UTC, Mon November 27, 2023

In the realm of Artificial Intelligence (AI), the biggest challenge isn’t always about crafting the solution. At times, it is about selecting the right tools for the job. With an array of algorithms and technologies available, how does one make the best choice?

Let us delve into the techniques for selecting these components for any AI project, with a special spotlight on Healthcare Revenue Cycle Management (RCM).

Here is a closer look at the techniques for making these selections:

1. Determine problem type

Before you can pick the right tool, you need to fully understand the problem. Here’s how:

i. Classification vs. regression: Classification involves categorizing data into predefined groups or classes, while regression deals with predicting a continuous output. Are you categorizing data into classes or predicting continuous values?

As an example in Healthcare RCM, predicting whether a patient’s insurance claim will be accepted or rejected is a classification problem, whereas estimating the amount a patient will owe after insurance adjustments is a regression problem.
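As a rough illustration of this first question, the sketch below guesses the problem type from the target column alone. The function name `suggest_problem_type` and the cutoff of 10 distinct values are assumptions made for this example, not an established rule.

```python
def suggest_problem_type(targets, max_classes=10):
    """Guess whether a target column calls for classification or regression.

    Heuristic only: text or boolean labels, or a small set of distinct
    integers, suggest classification; anything else suggests regression.
    """
    distinct = set(targets)
    if all(isinstance(t, (str, bool)) for t in distinct):
        return "classification"
    if all(isinstance(t, int) for t in distinct) and len(distinct) <= max_classes:
        return "classification"
    return "regression"


# Claim outcome (accepted or rejected) is a label.
print(suggest_problem_type(["accepted", "rejected", "accepted"]))
# The amount a patient owes after insurance adjustments is continuous.
print(suggest_problem_type([125.50, 80.00, 1432.75]))
```

A heuristic like this is only a starting point; the business question decides the problem type, not the column's data type.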

ii. Supervised vs. unsupervised learning: In supervised learning, algorithms learn from labeled training data, making predictions based on that input. Unsupervised learning, on the other hand, finds hidden patterns or intrinsic structures from unlabeled data. Do you have labeled data to train your model (supervised) or are you trying to find patterns in unlabeled data (unsupervised)?

As an example in Healthcare RCM, supervised learning predicts future claim outcomes like “paid” or “denied” using historical claims data, whereas unsupervised learning groups patients by their billing history to uncover hidden insights.

Reinforcement learning: Here, an algorithm learns by taking actions in an environment so as to maximize a cumulative reward. Are you operating in an environment where an algorithm can learn by interacting with it?

An example scenario for this is optimizing the collection process where the algorithm interacts with different approaches (like email, SMS, or phone calls) and learns the most effective way to get payments based on the rewards (successful payments).
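One simple way to frame the collection scenario is as a multi-armed bandit with an epsilon-greedy policy, sketched below. This is a minimal illustration, not a production approach: the function name `run_outreach_bandit` is hypothetical, and the payment rates per channel are made-up simulation inputs standing in for real collection outcomes.

```python
import random


def run_outreach_bandit(success_rates, rounds=5000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over outreach channels.

    success_rates maps each channel to a simulated probability of payment
    (an assumption for this sketch; real rewards would come from actual
    collection results). Returns the estimated payment rate per channel.
    """
    rng = random.Random(seed)
    channels = list(success_rates)
    attempts = {c: 0 for c in channels}
    estimates = {c: 0.0 for c in channels}

    for _ in range(rounds):
        if rng.random() < epsilon:
            channel = rng.choice(channels)  # explore a random channel
        else:
            channel = max(channels, key=estimates.get)  # exploit the best so far
        reward = 1.0 if rng.random() < success_rates[channel] else 0.0
        attempts[channel] += 1
        # Update the running mean of rewards for this channel.
        estimates[channel] += (reward - estimates[channel]) / attempts[channel]
    return estimates


simulated_rates = {"email": 0.10, "sms": 0.20, "phone": 0.40}  # illustrative only
estimates = run_outreach_bandit(simulated_rates)
best_channel = max(estimates, key=estimates.get)
print(best_channel, estimates)
```

The small exploration rate keeps the algorithm trying every channel occasionally, so it can adapt if patient behavior shifts.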

iii. Time series analysis: Time series analysis predicts future values based on past sequences. Is your problem centered on that kind of prediction? Forecasting the number of patient admissions for the next month based on past admission data is an example of a time series problem.
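A moving average is among the simplest time series baselines and is a useful yardstick before trying anything more elaborate. The sketch below assumes monthly admission counts in chronological order; the function name and the sample values are illustrative only.

```python
def moving_average_forecast(history, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(history) < window:
        raise ValueError("need at least `window` past observations")
    recent = history[-window:]
    return sum(recent) / window


monthly_admissions = [410, 395, 430, 445, 460, 452]  # illustrative values only
next_month = moving_average_forecast(monthly_admissions, window=3)
print(f"Forecast for next month: {next_month:.0f} admissions")
```

If a more complex forecasting model cannot beat this baseline on held-out months, the extra complexity is hard to justify.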

2. Consider the problem’s context

Context is king when it comes to choosing the right technology or algorithm to solve your problem.

i. Historical vs. real-time: Some applications require real-time analytics, like fraud detection, whereas others can work with batch processing. This decision influences the tools and infrastructure you will need. In a real-time scenario, systems can be designed to immediately flag potentially fraudulent claims as they are submitted, allowing for rapid intervention and reducing the likelihood of processing illegitimate claims.

On the other hand, historical data provides a broader perspective. For instance, by analyzing trends in claim denials over the past year, organizations can identify and address systemic issues, helping to streamline operations and minimize recurrent mistakes.
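The two modes lead to differently shaped code, as the sketch below suggests: one function checks a single claim at submission time, the other summarizes a whole history in one pass. The field names (`amount`, `policy_number`, `submitted`, `status`) and the review limit are assumptions for illustration, not a real claims schema.

```python
from collections import defaultdict


def flag_on_submission(claim, amount_limit=10_000):
    """Real-time path: check one claim at the moment it is submitted."""
    reasons = []
    if claim["amount"] > amount_limit:
        reasons.append("amount above review limit")
    if not claim.get("policy_number"):
        reasons.append("missing policy number")
    return reasons


def denial_rate_by_month(claims):
    """Batch path: summarize historical claims by submission month."""
    totals = defaultdict(int)
    denied = defaultdict(int)
    for claim in claims:
        month = claim["submitted"][:7]  # "YYYY-MM" prefix of an ISO date
        totals[month] += 1
        if claim["status"] == "denied":
            denied[month] += 1
    return {month: denied[month] / totals[month] for month in totals}


history = [  # illustrative records only
    {"submitted": "2023-01-14", "status": "denied", "amount": 250.0, "policy_number": "P-1"},
    {"submitted": "2023-01-20", "status": "paid", "amount": 90.0, "policy_number": "P-2"},
    {"submitted": "2023-02-03", "status": "paid", "amount": 410.0, "policy_number": "P-3"},
]
print(denial_rate_by_month(history))
print(flag_on_submission({"amount": 18_000.0, "policy_number": ""}))
```

The real-time function must answer within the submission workflow, while the batch function can run overnight on far larger volumes.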

ii. Static vs. dynamic: Will the data patterns remain largely consistent, or are they expected to change frequently? Static data refers to information that remains relatively unchanged over time, such as the demographic details of patients. This might include attributes like their date of birth, address, or contact details.

Conversely, dynamic data changes with circumstances or over time. An apt example would be the medical treatments a patient undergoes and their associated costs. In some contexts, this distinction is also referred to as master data versus transactional data.

iii. Structured vs. unstructured data: Will the AI be processing well-organized database data or disparate sources like images, text, and videos? Within a hospital billing system, examples of structured data include patient names, dates of birth, treatment codes, and billing amounts. This data is often stored in predefined fields, making it straightforward to query and analyze.

In contrast, unstructured data lacks a specific form or structure, making it more challenging to process and analyze using conventional database methods. Examples in healthcare include patient medical records that encompass doctor’s notes, medical images, or audio recordings. Such data might not fit neatly into standard database fields but still holds immense value for clinical decision-making and patient history documentation.

3. Assess data availability

Data is the lifeblood of machine learning and artificial intelligence. The kind and quantity of data you possess can strongly influence the algorithms you select for a project.

i. Volume: This refers to the amount of data available for analysis and training, and algorithms react differently to it. Some algorithms shine with vast datasets while others can work with smaller amounts. For instance, deep learning algorithms, which are neural network models with many layers, are especially data-hungry.

They learn from vast amounts of data to capture intricate patterns and relationships. In the context of healthcare revenue cycle management, if an organization has years of detailed billing and claims data, a deep learning model could potentially uncover subtle patterns that simpler algorithms might miss.

Traditional Machine Learning Models like decision trees or logistic regression might not require as much data as deep learning models. For example, a decision tree could be used in revenue cycle management to identify common reasons for claim denials based on fewer features and data points.

ii. Variety: Variety pertains to the different types or forms of data that are available. Consider the types of data you have — structured data like databases, unstructured data like text, or semi-structured data like XML. Different algorithms are better suited to different types of data, so understanding the nature of your data can guide algorithm choice.

Algorithms such as Support Vector Machines (SVM) or Linear Regression could be applied directly to structured data such as patient demographic details or billing amounts. Unstructured data such as clinical notes or medical images does not have a predefined data model and is typically more challenging to analyze and process.

Because such data is complex, algorithms tailored to it, like deep learning models for image data or NLP (Natural Language Processing) techniques for text data, are often employed. Semi-structured data falls between structured and unstructured data, and hybrid approaches might be employed for it.

Healthcare systems might employ XML or JSON formats to exchange data. EHRs might export data in a semi-structured format, mixing well-defined fields with free-text notes or observations. While it might not have the formal structure of a database, it does possess some organizational properties that make it more accessible than purely unstructured data.

NoSQL databases, like MongoDB, can be used to store and retrieve such data, and a combination of traditional ML techniques and NLP might be utilized to derive insights.
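To make the hybrid idea concrete, the sketch below splits a semi-structured JSON record into its well-defined fields and its free text, so each part can go to a suitable technique. The record layout, field names, and values are invented for illustration and do not reflect any particular EHR export format.

```python
import json

# Hypothetical export: well-defined fields mixed with a free-text note.
export = """
{
  "patient_id": "12345",
  "encounter": {"date": "2023-10-02", "treatment_code": "TX-100", "billed": 180.0},
  "notes": "Patient reports mild chest pain. Follow-up in two weeks."
}
"""


def split_record(raw):
    """Separate well-defined fields from free text in a semi-structured record."""
    record = json.loads(raw)
    encounter = record["encounter"]
    structured = {
        "patient_id": record["patient_id"],
        "date": encounter["date"],
        "treatment_code": encounter["treatment_code"],
        "billed": encounter["billed"],
    }
    free_text = record.get("notes", "")
    return structured, free_text


structured, free_text = split_record(export)
print(structured)  # ready for a conventional model or a database table
print(free_text)   # would be handed to an NLP step instead
```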

iii. Quality: High-quality data is pivotal for the success of an AI project. Assess the quality and look for inconsistencies, missing values, and outliers. As an example, when processing insurance claims, a missing insurance policy number or an incorrect patient identifier can result in claim rejections or delays. Data cleaning might be a necessary step before algorithm selection.
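A basic quality pass can be as simple as the sketch below, which lists claims with missing required fields or unusually large amounts. The field names, the choice of required fields, and the "10 times the median" outlier rule are all assumptions for this example; real thresholds depend on the organization's data.

```python
from statistics import median


def quality_report(claims, required=("policy_number", "patient_id"), outlier_factor=10):
    """List claims with missing required fields or unusually large amounts."""
    issues = []
    typical = median(c["amount"] for c in claims)
    for index, claim in enumerate(claims):
        for field in required:
            if not claim.get(field):
                issues.append((index, f"missing {field}"))
        if claim["amount"] > outlier_factor * typical:
            issues.append((index, "amount is an outlier"))
    return issues


claims = [  # illustrative records only
    {"policy_number": "P-100", "patient_id": "A1", "amount": 120.0},
    {"policy_number": "", "patient_id": "A2", "amount": 95.0},
    {"policy_number": "P-102", "patient_id": "A3", "amount": 150.0},
    {"policy_number": "P-103", "patient_id": None, "amount": 25000.0},
]
for index, problem in quality_report(claims):
    print(f"claim {index}: {problem}")
```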

4. Evaluate the trade-off between explainability and accuracy

Sometimes, it is essential to understand how the model makes decisions, especially in sensitive areas like healthcare. In such scenarios, a simpler and more interpretable model might be preferred over a complex black-box model. The explainability vs. accuracy needs depend on the following factors:

i. Transparency: In some domains, understanding how the model arrived at a decision is critical. Algorithms like Decision Trees or Linear Regression are more interpretable than neural networks. Suppose a predictive model is used to flag potential claim denials – in that case, it becomes imperative for billing professionals to understand why a particular claim might be denied.

A decision tree might highlight that a claim was flagged because of a specific missing code, making it easier for a coder to rectify the error, whereas a neural network might just predict a claim’s likelihood of denial without such clarity.

ii. Stakeholder needs: If stakeholders require explanations, lean towards models that provide clearer insights into their decision-making process. For instance, if a model predicts that a patient is likely to default on a payment, the financial counseling team might need to understand why to provide appropriate financial advice.

A model based on linear regression might show that patients without insurance or those with high deductibles have a higher default rate. This insight can lead to more tailored patient engagement and counseling strategies.

iii. Regulations: In some sectors, especially healthcare or finance, regulations might dictate the need for interpretable models. For example, if an AI model automatically adjusts or writes off certain patient charges, regulators might need clarity on the criteria used for these adjustments.

In such cases, a more interpretable model like Logistic Regression, which can clearly delineate the weight of each factor leading to a decision, might be preferred over a deep learning model.
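The sketch below shows why logistic regression is easy to explain: each feature's contribution to the log-odds is just its weight times its value. The intercept, weights, and feature names are hypothetical stand-ins for what a fitted model would provide.

```python
import math

# Hypothetical coefficients, as if taken from a fitted logistic regression.
INTERCEPT = -2.0
WEIGHTS = {"uninsured": 1.5, "high_deductible": 0.8, "prior_missed_payments": 0.6}


def explain_prediction(features):
    """Return the predicted probability and each feature's log-odds contribution."""
    contributions = {name: WEIGHTS[name] * features.get(name, 0) for name in WEIGHTS}
    log_odds = INTERCEPT + sum(contributions.values())
    probability = 1 / (1 + math.exp(-log_odds))
    return probability, contributions


probability, contributions = explain_prediction(
    {"uninsured": 1, "high_deductible": 0, "prior_missed_payments": 2}
)
print(f"Predicted probability: {probability:.2f}")
for name, value in sorted(contributions.items(), key=lambda item: -item[1]):
    print(f"  {name}: {value:+.2f} log-odds")
```

A reviewer can read the ranked contributions directly, which is the kind of clarity a deep learning model does not offer without extra tooling.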

5. Analyze technical complexity

Consider the complexity of your problem and the necessary computational power. It is essential to weigh the accuracy of an AI solution against its computational demands and infrastructure needs. Not all use cases require the most advanced algorithms. Sometimes, a simpler and more agile solution might better serve the fast-paced requirements than a marginally more accurate but resource-intensive counterpart.

i. Algorithm complexity: While complex algorithms might offer better accuracy, they often come with higher computational costs and longer training times. Let us consider the task of predicting claim denials, which is a significant aspect of RCM. While a complex algorithm like a deep neural network might theoretically produce a marginally higher accuracy rate in predictions, it also demands a substantial amount of data and can be computationally expensive.

On the other hand, simpler algorithms like Logistic Regression or Random Forests might be slightly less accurate but are quicker to train and easier to implement. For example, if the neural network offers 95% accuracy but takes 72 hours to train and requires specialized hardware, compared to a Random Forest that provides 93% accuracy and trains in just 3 hours on a standard server, the latter might be more efficient and cost-effective for the RCM process.
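Working through the figures in this example makes the trade-off explicit. The short calculation below uses only the numbers quoted above; the variable names are chosen for this sketch.

```python
neural_net = {"accuracy_pct": 95, "train_hours": 72}
random_forest = {"accuracy_pct": 93, "train_hours": 3}

accuracy_gain = neural_net["accuracy_pct"] - random_forest["accuracy_pct"]
extra_hours = neural_net["train_hours"] - random_forest["train_hours"]
slowdown = neural_net["train_hours"] / random_forest["train_hours"]

print(f"Accuracy gain: {accuracy_gain} percentage points")
print(f"Extra training time: {extra_hours} hours ({slowdown:.0f}x longer)")
print(f"Extra training hours per point gained: {extra_hours / accuracy_gain:.1f}")
```

In other words, the neural network buys 2 percentage points of accuracy for 69 additional training hours, before counting the specialized hardware.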

ii. Infrastructure considerations: Ensure that you have the necessary computational resources, especially if considering deep learning or other resource-intensive algorithms.

Say an RCM team wants to employ AI for real-time claim validation at the point of service. Using a deep learning model, though potentially accurate, might lead to slower response times due to its resource-intensive nature, thereby delaying the patient checkout process. This could lead to patient dissatisfaction.

In contrast, a simpler model might provide a quick validation check, ensuring smooth operations. For such real-time tasks, the RCM team must assess if the current computational infrastructure can support quick responses, especially if considering heavy algorithms.
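One way to run that assessment is to time a candidate model against a response-time budget before deploying it, as sketched below. The 200 ms budget, the function names, and the one-rule `simple_check` stand-in are assumptions for illustration; a real test would call the actual model on representative claims.

```python
import time


def p95_latency_ms(predict, sample, runs=50):
    """Time repeated calls to `predict`; return the 95th-percentile latency in ms."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        predict(sample)
        timings.append((time.perf_counter() - start) * 1000)
    timings.sort()
    return timings[int(0.95 * (len(timings) - 1))]


def meets_budget(predict, sample, budget_ms, runs=50):
    """Check whether the model responds within the point-of-service budget."""
    return p95_latency_ms(predict, sample, runs) <= budget_ms


def simple_check(claim):
    """Stand-in for a lightweight model: a single validation rule."""
    return bool(claim.get("policy_number"))


print(meets_budget(simple_check, {"policy_number": "P-100"}, budget_ms=200))
```

Measuring a high percentile rather than the average matters here, because the slowest responses are the ones that hold up a patient at checkout.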

Final thoughts on evaluating AI investments in healthcare

Selecting the right AI solutions goes beyond just technical considerations. It is imperative to also factor in the financial aspects. While there is an undeniable cost associated with the development, infrastructure, and ongoing maintenance of AI tools, these expenditures should be viewed against the backdrop of potential returns.

Ultimately, the goal is to ensure that any financial commitment to AI not only aligns with organizational objectives but also promises a favorable return, both in terms of monetary gains and improved outcomes. With a robust selection strategy, you can ensure that the tools and techniques you choose align perfectly with your project’s objectives.

About the Author:

Ravi Tenneti is passionate about technology strategy, leadership, and AI in Healthcare. Over the past several years, he led large-scale digital transformation initiatives and strategic technology teams in the manufacturing, semiconductor, banking and healthcare industries.

Tenneti provided strategic advisory to 10+ pre-seed and seed startups and also held leadership roles in several non-profit organizations. He holds a Computer Science degree, an MBA, and a Chief Technology Officer’s certification from Wharton Business School.

He is also an official member of Forbes Technology Council and Vation Ventures Ohio Innovation Council, and serves on the executive board of Fast Company magazine.