Machine Learning for Patent Analysis

Machine learning, a subset of artificial intelligence, involves training models to learn from data and make predictions or decisions without being explicitly programmed for each task. Machine learning spans several learning paradigms, each with distinct characteristics and applications to patent analysis.

Supervised Learning: Best for tasks where labeled data is available, such as patent classification, patent summarization, and claim construction. These models learn mappings from inputs to outputs from labeled datasets and then make predictions on new data.

Unsupervised Learning: Suited to tasks with large unlabeled datasets, such as prior-art searching, where the goal is to discover hidden patterns or similarities. Clustering and topic modeling can uncover relationships between documents without labeled data.

Semi-Supervised Learning: Suitable for tasks where labeled data is scarce but can be augmented with unlabeled data, such as infringement identification. This approach uses a small amount of labeled data to guide learning on a larger set of unlabeled data, which can improve model accuracy.

Reinforcement Learning: Useful for optimizing workflows and decision-making processes in patent analysis, where sequential actions and resource allocation are crucial. It can dynamically adjust patent classification, optimize prior-art searching strategies, automate infringement identification, and refine claim construction based on feedback and rewards.

Transfer Learning: Takes a model pre-trained on a large dataset and fine-tunes it on a smaller, related dataset, which makes it particularly useful for patent classification and summarization when labeled data is limited.

Federated Learning: A decentralized approach where multiple entities collaborate to train a shared model without sharing their data, improving model performance through diverse datasets while maintaining data privacy. This suits collaborative patent analysis across organizations.

Active Learning: An iterative process where the model selects the most informative samples for labeling, minimizing labeling effort while maximizing performance. It can be applied to challenging cases in patent classification, claim construction, prior-art searching, and infringement analysis to improve accuracy with fewer labeled examples.

Supervised Learning

In supervised learning, the model is trained on a labeled dataset. Each training example is paired with an output label, and the model learns to map inputs to the corresponding outputs.

Characteristics:

Requires a large amount of labeled data for training.
The goal is to learn a mapping from inputs to outputs and make predictions on new, unseen data.
Common algorithms include linear regression, logistic regression, decision trees, support vector machines, and neural networks.

Application of Supervised Learning to Patent Analysis

Patent Classification

Patent classification, which entails categorizing patents into technology domains (different from existing classifications such as CPC), typically relies on a pre-labeled dataset in which patents are already categorized. Supervised learning models can be trained on this labeled data to predict the category of new patents.

Typical approaches use algorithms such as Support Vector Machines (SVM) or neural networks, trained on labeled data in which each patent is associated with a category.
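As a minimal sketch of this supervised setup, the example below trains on a handful of labeled text snippets and predicts the category of a new one. To stay dependency-free it uses a bag-of-words nearest-centroid classifier in place of an SVM or neural network; the snippets, category names, and function names are illustrative assumptions, not real patent data.

```python
import math
from collections import Counter, defaultdict


def vectorize(text):
    """Bag-of-words term counts for one document."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def train(labeled_docs):
    """Sum the term counts of every training document per category."""
    centroids = defaultdict(Counter)
    for text, label in labeled_docs:
        centroids[label].update(vectorize(text))
    return dict(centroids)


def predict(centroids, text):
    """Return the category whose centroid is most similar to the text."""
    vec = vectorize(text)
    return max(centroids, key=lambda label: cosine(vec, centroids[label]))


# Hypothetical labeled training data (text, category).
TRAIN = [
    ("antenna array transmits wireless signal to base station", "wireless"),
    ("receiver decodes wireless signal using antenna", "wireless"),
    ("battery cell with lithium anode and electrolyte", "energy storage"),
    ("electrolyte composition for rechargeable battery", "energy storage"),
]

model = train(TRAIN)
label = predict(model, "a lithium battery with improved electrolyte")
print(label)
```

In practice the labeled set would be far larger and the classifier would be swapped for an SVM or a fine-tuned neural network, but the train-then-predict shape stays the same.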

Patent Summaries

Summarization benefits from supervised learning because the model must learn the mapping between the full text and its summary, which requires labeled pairs of patent documents and their summaries.

Typical approaches use Sequence-to-Sequence (Seq2Seq) models with attention mechanisms or Transformer-based models (for example, GPT-style generative models, or BERT-based models for extractive summarization), trained on labeled datasets in which a summary is provided for each patent.

Claim Construction

Claim construction, which involves construing the meaning of specific terms and phrases used in claims in light of the specification, benefits from supervised learning on labeled datasets where the meaning of claim terms has been annotated. The meaning of claim terms used for the labeled dataset can be obtained from court opinions (Markman rulings), PTAB decisions, or from experts (a POSITA and/or legal practitioner).

Typical approaches use Natural Language Processing (NLP) models such as BERT or RoBERTa, fine-tuned on labeled datasets in which claims are annotated with their construction.

Unsupervised Learning

In unsupervised learning, the model is trained on unlabeled data and tries to learn the underlying structure or distribution of the data without explicit labels. It is useful for clustering and dimensionality reduction tasks.

Characteristics:

Does not require labeled data.
The goal is to discover hidden patterns, groupings, or features in the data.
Common algorithms include k-means clustering, hierarchical clustering, principal component analysis (PCA), and autoencoders.

Application of Unsupervised Learning to Patent Analysis

Prior-Art Searching

Prior-art searching involves large datasets for which labeled data might not be readily available. Unsupervised models can use clustering or topic modeling (e.g., Latent Dirichlet Allocation, or LDA) to group similar documents based on their textual content. This uncovers relationships and similarities between documents and helps surface prior art without labeled data.
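One simple way to see content-based similarity at work, without any labels, is to rank candidate documents against a claim by TF-IDF cosine similarity. This is a simpler relative of the clustering and topic-modeling methods named above, not an implementation of LDA; the corpus, claim text, and function names below are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build TF-IDF vectors (with a smoothed IDF) for a list of documents."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    return [{t: c * idf[t] for t, c in Counter(tokens).items()} for tokens in tokenized]


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_prior_art(claim, corpus):
    """Return (corpus index, similarity) pairs, most similar first."""
    vectors = tfidf_vectors([claim] + corpus)
    query, candidates = vectors[0], vectors[1:]
    scores = [cosine(query, vec) for vec in candidates]
    return sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)


# Hypothetical candidate references and claim language.
CORPUS = [
    "method for charging a lithium battery using pulsed current",
    "antenna array for beamforming in wireless networks",
    "pulsed current charging circuit for battery packs",
]
CLAIM = "battery charging method using pulsed current"

ranking = rank_prior_art(CLAIM, CORPUS)
for index, score in ranking:
    print(f"{score:.2f}  {CORPUS[index]}")
```

The same vectors could feed a clustering algorithm such as k-means to group the corpus by topic before any human review.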

Semi-Supervised Learning

Semi-supervised learning combines elements of both supervised and unsupervised learning. The algorithm is typically trained on a small amount of labeled data and a large amount of unlabeled data. It is useful in scenarios where labeled data is expensive or time-consuming to obtain.

Characteristics:

Uses labeled and unlabeled data to improve learning efficiency.
Leverages the small labeled dataset to guide the learning process on the larger unlabeled dataset.
Common approaches include self-training, co-training, generative models, and variants of supervised learning algorithms modified to handle unlabeled data.

Application of Semi-Supervised Learning to Patent Analysis

Infringement Identification / Searching for Licensing Targets

Infringement identification often has limited labeled data due to the complexity and specificity of each case. Semi-supervised learning can combine a small amount of labeled data with a larger corpus of unlabeled data to improve the model's accuracy. Labeled examples that map patent claims to product literature can be used to train models such as semi-supervised neural networks or self-training algorithms.
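The self-training idea can be sketched as a loop: fit on the labeled examples, pseudo-label the unlabeled examples the model is confident about, add them to the training set, and repeat. The sketch below assumes a toy nearest-centroid text classifier and a confidence threshold of 0.4; the snippets, labels, and threshold are hypothetical choices for illustration only.

```python
import math
from collections import Counter, defaultdict


def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def fit(labeled):
    """Sum bag-of-words counts per label to form one centroid per label."""
    centroids = defaultdict(Counter)
    for text, label in labeled:
        centroids[label].update(text.lower().split())
    return dict(centroids)


def predict(centroids, text):
    """Return (best label, similarity to that label's centroid)."""
    vec = Counter(text.lower().split())
    best = max(centroids, key=lambda label: cosine(vec, centroids[label]))
    return best, cosine(vec, centroids[best])


def self_train(labeled, unlabeled, threshold=0.4, max_rounds=5):
    """Grow the labeled set with confident pseudo-labels, round by round."""
    labeled, pool, history = list(labeled), list(unlabeled), []
    for round_no in range(1, max_rounds + 1):
        centroids = fit(labeled)
        confident, remaining = [], []
        for text in pool:
            label, score = predict(centroids, text)
            if score >= threshold:
                confident.append((text, label))
            else:
                remaining.append(text)
        if not confident:
            break  # nothing new to learn from
        labeled.extend(confident)
        history.extend((text, label, round_no) for text, label in confident)
        pool = remaining
    return fit(labeled), history


# Hypothetical product-literature snippets; only two are labeled.
LABELED = [
    ("charger applies pulsed current to battery", "match"),
    ("kitchen blender with steel blades", "no match"),
]
UNLABELED = [
    "pulsed current charger for phone battery",
    "blender with detachable blades",
    "fast charger for phone",
]

model, history = self_train(LABELED, UNLABELED)
for text, label, round_no in history:
    print(round_no, label, text)
```

Here the third snippet is too dissimilar to the original labeled examples to be labeled in the first round, and only clears the threshold after the first pseudo-labeled snippet has been absorbed. Pseudo-labels can also reinforce early mistakes, so in a legal setting they would normally be spot-checked by a reviewer.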

Reinforcement Learning

Reinforcement learning (RL) is a type of machine learning in which an agent learns to make decisions by performing actions in an environment so as to maximize cumulative reward. Reinforcement Learning from Human Feedback (RLHF) incorporates human feedback into the reward function so that the model performs tasks in closer alignment with human goals and preferences.

Characteristics:

The agent interacts with the environment and receives feedback in the form of rewards or penalties.
The goal is to learn a policy that maximizes the cumulative reward over time.
Common algorithms include Q-learning, Deep Q-Networks (DQN), and Proximal Policy Optimization (PPO).
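For reference, the standard tabular Q-learning update shows how reward feedback shapes the learned policy. Here s_t and a_t are the state and action at step t, r_{t+1} is the reward received, alpha is the learning rate, and gamma is the discount factor applied to future rewards:

```latex
Q(s_t, a_t) \leftarrow Q(s_t, a_t)
  + \alpha \left[ r_{t+1} + \gamma \max_{a'} Q(s_{t+1}, a') - Q(s_t, a_t) \right]
```

The bracketed term is the gap between the estimated value after observing the reward and the previous estimate; the agent's policy then favors actions with the highest Q values.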

Application of Reinforcement Learning to Patent Analysis

Patent Classification

Reinforcement learning can be used to dynamically adjust the patent classification process based on feedback. For example, a model could learn to prioritize certain sections of the patent specification that, read together with the claims, are more indicative of specific categories, thereby improving classification accuracy over time.

Prior-Art Searching

An RL agent can be trained to optimize the search strategy by exploring various paths and techniques to identify relevant prior art more efficiently. The agent receives rewards for finding high-relevance documents, leading to more effective search processes.
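A heavily simplified sketch of this idea is an epsilon-greedy multi-armed bandit, the simplest reinforcement learning setting (not Q-learning or PPO). Each "arm" is a search strategy, and the reward is 1 when a search returns a relevant document. The strategy names and their success rates are hypothetical, and the environment is simulated with a seeded random number generator.

```python
import random


def run_bandit(success_rates, steps=2000, epsilon=0.1, seed=0):
    """Epsilon-greedy bandit over search strategies.

    success_rates maps a strategy name to the (hidden) probability that one
    search with that strategy finds a relevant document. The agent only sees
    the 0/1 rewards and keeps a running average reward per strategy.
    """
    rng = random.Random(seed)
    names = list(success_rates)
    counts = {name: 0 for name in names}
    values = {name: 0.0 for name in names}  # estimated reward per strategy

    for _ in range(steps):
        if rng.random() < epsilon:
            choice = rng.choice(names)            # explore
        else:
            choice = max(names, key=values.get)   # exploit the best estimate
        reward = 1.0 if rng.random() < success_rates[choice] else 0.0
        counts[choice] += 1
        # Incremental update of the running mean reward.
        values[choice] += (reward - values[choice]) / counts[choice]
    return counts, values


# Hypothetical strategies with assumed success probabilities.
STRATEGIES = {
    "keyword search": 0.2,
    "classification-code search": 0.5,
    "citation-graph search": 0.8,
}

counts, values = run_bandit(STRATEGIES)
for name in STRATEGIES:
    print(f"{name}: tried {counts[name]} times, estimated reward {values[name]:.2f}")
```

A real search agent would face states (the query so far, the documents already seen) and delayed rewards, which is where full RL algorithms come in; the bandit only captures the explore-versus-exploit trade-off.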

Infringement Identification

Similarly, RL can be applied to automate and optimize the comparison of patent claims with details regarding a target product or process. The agent can learn to focus on key features or sections of the product literature that are more likely to indicate infringement, improving the efficiency and accuracy of the identification process.

Claim Construction

Reinforcement learning can be utilized to refine the process of claim interpretation by updating the model based on new District Court, ITC, and PTAB decisions. The agent can receive rewards for accurately construing claims based on expert feedback, thereby improving its ability to interpret complex claim language over time.

Transfer Learning

Transfer learning involves taking a model that was pre-trained on a large dataset and fine-tuning it on a smaller, related dataset. This approach is particularly useful when there is limited labeled data available for the specific task.

Application of Transfer Learning to Patent Analysis

Patent Classification & Summarization

Pre-trained language models such as BERT (commonly fine-tuned for classification) or GPT (commonly used for text generation) can be fine-tuned on patent datasets to improve classification accuracy and generate higher-quality summaries, even with limited labeled data.

Federated Learning

Federated learning is a decentralized approach where multiple entities collaborate to train a model without sharing their data. Each entity trains a local model on its own data and only shares model updates with a central server.

Different organizations (e.g., law firms, patent analytics firms, research institutions, etc.) can collaborate to train robust patent analysis models while keeping their proprietary data private. This approach enhances model performance through diverse datasets without compromising data privacy.
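The "share model updates, not data" loop can be sketched with federated averaging on a toy one-parameter model. Each client fits y = w * x on its private data with a few gradient steps, and the server only ever sees the resulting weights, which it averages in proportion to each client's sample count. The client data, learning rate, and function names are assumptions made for illustration.

```python
def local_update(w, data, lr=0.1, steps=5):
    """Run a few gradient-descent steps on one client's private data."""
    for _ in range(steps):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * x * (w * x - y) for x, y in data) / len(data)
        w -= lr * grad
    return w


def federated_average(weights, sizes):
    """Average client weights, weighted by each client's sample count."""
    total = sum(sizes)
    return sum(w * n for w, n in zip(weights, sizes)) / total


def train_federated(clients, rounds=20):
    """Server loop: broadcast w, collect local updates, average them."""
    w = 0.0
    for _ in range(rounds):
        local_weights = [local_update(w, data) for data in clients]
        w = federated_average(local_weights, [len(data) for data in clients])
    return w


# Hypothetical private datasets held by two organizations; both follow y = 2x.
CLIENTS = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)],
]

global_w = train_federated(CLIENTS)
print(round(global_w, 4))
```

Only the scalar weights cross organizational boundaries in this sketch. Real deployments typically add protections such as secure aggregation, because model updates can still leak information about the underlying data.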

Active Learning

Active learning is an iterative process where the model actively selects the most informative samples for labeling. This approach minimizes the labeling effort while maximizing model performance.

Active learning can be used to identify the most challenging cases in patent classification, claim construction, prior-art searching, or infringement analysis for expert review. By focusing on these cases, the model can improve its accuracy with fewer labeled examples.
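A common way to pick "the most informative samples" is uncertainty sampling: send the expert the documents whose predicted probability is closest to 0.5, where a binary classifier is least sure. The sketch below assumes the model already produced a relevance probability per document; the document IDs, scores, and review budget are hypothetical.

```python
def select_for_review(probabilities, budget):
    """Return the `budget` document IDs the model is least certain about.

    probabilities maps a document ID to the model's predicted probability
    that the document belongs to the positive class (e.g., "relevant").
    """
    by_uncertainty = sorted(
        probabilities,
        key=lambda doc_id: abs(probabilities[doc_id] - 0.5),
    )
    return by_uncertainty[:budget]


# Hypothetical model outputs for four unlabeled documents.
SCORES = {"doc-1": 0.95, "doc-2": 0.52, "doc-3": 0.10, "doc-4": 0.45}

review_queue = select_for_review(SCORES, budget=2)
print(review_queue)
```

After the expert labels the selected documents, they are added to the training set, the model is retrained, and the selection step repeats.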

Summary

Kama Thuo, PLLC collaborates with various AI vendors to efficiently tackle AI-based patent analysis matters. The table below highlights a hypothetical collaboration on typical patent analysis tasks.

AI-based patent analysis
Patent Analysis Task	Legal Analysis (KTH patents & AI counseling)	Patent Analytics Vendor (Preferred: Patent Analytics, Inc)	AI Automation Vendor (Preferred: Rfwel Engineering, LLC)
Invalidity Search	Provide legal opinions on patent invalidity	Generate invalidity search reports using patent databases & AI tools	Train/fine-tune AI models to identify correct prior art and different types of NPL prior art such as system art.
Freedom-to-operate/ Clearance Search	Analyze legal risks for potential infringement and evaluate design-around modifications	Compile relevant patent data and generate clearance reports	Fine-tune AI models to streamline clearance searches
Patent Acquisition Search	Legal due diligence for patent acquisitions	Provide data on patent portfolios and potential acquisition targets	Automate patent portfolio analysis using AI and integrate workflows
Infringement Identification/ Licensing Target Search	Legal assessment of potential infringements and licensing viability	Extract and compile relevant patent and product literature	Integrate tools to search product documentation and apply to models to compare against target claims.
Patent Classification	Analyze proposed technology classes and methodology to capture claim scope and pre-sort based on existing CPCs	Generate and optimize labeling datasets for patent classification	Prepare and label patents for supervised learning models
Patent/Family Summaries	Analyze and revise generated summaries for reinforcement learning	Generate detailed patent family reports	Fine-tune NLP models to create concise patent summaries
Patent Excavation Study	Provide legal insights and implications of findings, for example, in monetization	Prepare patent portfolio for study	Enhance data mining processes with AI-driven techniques
Technology Landscape Study	Analyze and provide legal context for technology trends	Generate reports on technology trends and patent landscapes	Integrate data sources with AI-based tools to discover and analyze trend data
Claim Construction	Interpret and provide legal opinions on likely claim meanings	Gather relevant official claim constructions	Apply relevant claim constructions to fine-tune NLP models to predict reasonable construction for new terms
Claim Amendment Study/Targeted Prosecution	Legal strategy for claim amendments and continued prosecution and analyze potential reads on target products or standards	Assist with prosecution history extraction and data pre-processing; provide reports on read on products or standards	Integrate with APIs on prosecution data and other analysis AI tools

Contact us for any questions or clarifications.

Preferred support vendors:

AI Automation Vendor (Rfwel Engr AI Group) | Patent Analysis AI Tool (patanal.ai) | Patent Analytics Vendor (Patent Analytics, Inc) | Wireless Technology Consultants (Rfwel Engr WDI Research)
