Scope.

Build a live database of GHG resources with the help of an ecosystem of orchestrated AI robots, a dedicated team and support from the client community. Offer the content, statistics and internal tools to clients so they can accelerate their own data collection, validate information, allocate resources correctly and participate in building the database.

Architecture

Database Architecture

The database will be structured around resources. Each resource record will capture its carbon footprint, physical properties, service description, usage (in which other resources, processes, clients and allocation scopes it appears), sources of information, collection paths, credibility score, cost of collection, version control, validation sources and last update.
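One way to picture a resource record is as a typed structure holding the attributes listed above. The sketch below is illustrative only: the class names `Resource` and `Usage`, every field name and the unit of the footprint are assumptions, not the platform's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class Usage:
    """Where a resource is used (all field names are illustrative)."""
    parent_resource_id: Optional[str] = None  # composite resource containing this one
    process: Optional[str] = None
    client: Optional[str] = None
    allocation_scope: Optional[int] = None    # GHG emission scope: 1, 2 or 3


@dataclass
class Resource:
    """Hypothetical shape of one database entry."""
    resource_id: str
    name: str
    carbon_footprint_kg_co2e: Optional[float] = None  # unit is an assumption
    physical_properties: dict = field(default_factory=dict)
    service_description: str = ""
    usages: list = field(default_factory=list)
    sources: list = field(default_factory=list)
    collection_path: list = field(default_factory=list)
    credibility_score: Optional[float] = None
    collection_cost: Optional[float] = None
    version: int = 1
    validation_sources: list = field(default_factory=list)
    last_update: Optional[date] = None


# Example entry with placeholder values; the footprint is still unknown.
steel_beam = Resource(
    resource_id="res-001",
    name="Steel beam",
    physical_properties={"mass_kg": 120.0},
    usages=[Usage(process="construction", allocation_scope=3)],
    sources=["supplier declaration"],
)
```

Keeping unknown values as `None` rather than zero lets later steps tell "not yet collected" apart from "measured as zero".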

System Architecture

The core of the system is built around an AI Orchestrator. The Orchestrator will coordinate the activity of an ever-increasing number of specialized AI robots.
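As a rough illustration of the coordination role, the sketch below treats each robot as a plain function and the Orchestrator as a registry that routes tasks by type and keeps a journal of what was dispatched. The names `Orchestrator`, `register`, `dispatch` and `integration_robot` are hypothetical, and a real orchestrator would also handle scheduling, retries and failures.

```python
from typing import Callable, Dict, List

# A "robot" here is just a function that takes a task payload and returns a result.
Robot = Callable[[dict], dict]


class Orchestrator:
    """Toy coordinator: keeps a registry of robots and routes tasks by type."""

    def __init__(self) -> None:
        self._robots: Dict[str, Robot] = {}
        self.journal: List[dict] = []  # record of every dispatched task

    def register(self, task_type: str, robot: Robot) -> None:
        self._robots[task_type] = robot

    def dispatch(self, task_type: str, payload: dict) -> dict:
        if task_type not in self._robots:
            raise KeyError(f"no robot registered for task type {task_type!r}")
        result = self._robots[task_type](payload)
        self.journal.append({"task": task_type, "payload": payload, "result": result})
        return result


def integration_robot(payload: dict) -> dict:
    # Placeholder: a real robot would query a certified public database.
    return {"resource": payload["resource"], "status": "queued for import"}


orchestrator = Orchestrator()
orchestrator.register("integration", integration_robot)
outcome = orchestrator.dispatch("integration", {"resource": "diesel fuel"})
```

New robots can be added by registering another function, which matches the idea of a growing set of specialized robots behind a single coordinator.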

Data integration robot. This robot will collect data from known, certified public databases and data sources.

Data discovery robot. This robot will collect information by searching what is available on the internet. It will be able to crawl from data source to data source in order to identify the root source of a piece of information.
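The crawl from source to source can be pictured as following "cited from" links until a source names no upstream source. The sketch below assumes the links have already been collected into a dictionary; the function name `trace_to_root` and the sample source names are hypothetical.

```python
from typing import Dict, List, Optional


def trace_to_root(start: str, cited_from: Dict[str, Optional[str]]) -> List[str]:
    """Follow 'cited from' links until a source names no upstream source.

    Returns the path from the starting source to the root source.
    Stops early if a link loops back to a source already visited.
    """
    path = [start]
    seen = {start}
    current = start
    while True:
        upstream = cited_from.get(current)
        if upstream is None or upstream in seen:
            return path
        path.append(upstream)
        seen.add(upstream)
        current = upstream


# Hypothetical link map; a real robot would fill it in while crawling.
links = {
    "blog-post": "industry-report",
    "industry-report": "manufacturer-datasheet",
    "manufacturer-datasheet": None,
}
path = trace_to_root("blog-post", links)
```

The last element of the returned path is the candidate root source, and the full path can be stored as the collection path of the resource.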

Data collection robot. This robot will collect data from internal company documents. It will extract information from the chart of accounts, accounting records, invoices, contracts, bills, technical proposals, goods receipts, feasibility studies and technical documentation.

Iterative discovery robot. If the information cannot be found at the first source, this robot will break the object or service down into components and identify sources for those components. The process will be repeated until the information can be extracted with the help of other specialized robots, or until the original manufacturer or service provider is identified and can be contacted.
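The repeated breakdown into components is essentially a bounded recursive search. The sketch below is a minimal version under stated assumptions: data already found is held in a dictionary `known`, the breakdown of each item is held in `components`, and anything left unresolved is where a manufacturer or service provider would need to be contacted. All names and the numeric values are placeholders, not real emission data.

```python
from typing import Dict, List, Tuple


def discover(item: str,
             known: Dict[str, float],
             components: Dict[str, List[str]],
             max_depth: int = 5) -> Tuple[Dict[str, float], List[str]]:
    """Resolve an item directly, or break it down and resolve its parts.

    Returns (resolved, unresolved): values found per item, and the items
    that have no data and no further breakdown.
    """
    resolved: Dict[str, float] = {}
    unresolved: List[str] = []

    def visit(name: str, depth: int) -> None:
        if name in known:
            resolved[name] = known[name]
        elif depth < max_depth and components.get(name):
            for part in components[name]:
                visit(part, depth + 1)
        else:
            unresolved.append(name)  # candidate for contacting the manufacturer

    visit(item, 0)
    return resolved, unresolved


# Placeholder values, not real emission factors.
known = {"steel frame": 1.0, "rubber tyre": 2.0}
components = {"bicycle": ["steel frame", "wheel"], "wheel": ["rubber tyre", "spokes"]}
found, missing = discover("bicycle", known, components)
```

The `max_depth` limit is a design choice of this sketch: it keeps the search finite even if the component data contains a loop.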

Data correlation robot. This robot will analyze the data collection journals, build correlations describing the internal links of composite resources, and build a trusted, versioned chain covering the life cycle of a resource (from creation to ecological disposal). It will also analyze the usage of resources from the perspective of clients, processes, activities, etc.
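One possible reading of a trusted, versioned chain is a list of life cycle entries in which each entry stores a hash of the previous one, so that later tampering can be detected. This is an assumption about how such a chain could work, not a description of the platform; the function names and the sample stages are hypothetical.

```python
import hashlib
import json
from typing import List


def _digest(record: dict) -> str:
    # Stable hash of a record: keys are sorted so equal content gives equal hashes.
    return hashlib.sha256(json.dumps(record, sort_keys=True).encode("utf-8")).hexdigest()


def append_version(chain: List[dict], stage: str, data: dict) -> None:
    """Add a life cycle entry that points at the hash of the previous entry."""
    previous_hash = chain[-1]["hash"] if chain else ""
    body = {"version": len(chain) + 1, "stage": stage, "data": data,
            "previous_hash": previous_hash}
    chain.append({**body, "hash": _digest(body)})


def chain_is_intact(chain: List[dict]) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    previous_hash = ""
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["previous_hash"] != previous_hash or _digest(body) != entry["hash"]:
            return False
        previous_hash = entry["hash"]
    return True


history: List[dict] = []
append_version(history, "creation", {"site": "plant A"})
append_version(history, "use", {"client": "client X"})
append_version(history, "disposal", {"method": "recycling"})
```

Calling `chain_is_intact(history)` after any edit to an earlier entry returns `False`, which is what makes the version history checkable.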

Data quality robot. This robot will calculate a credibility score for the information associated with a resource. The score is based on the credibility of the internal and external information used to collect the data, the number of sources, the depth of the source (how far up the supply chain it sits) and how much clients use the information.
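The four factors named above could be combined in many ways. The sketch below uses a simple weighted sum that yields a score between 0 and 1; the weights, the saturation points and the function name `credibility_score` are placeholders chosen for illustration, and the input source scores are assumed to already lie between 0 and 1.

```python
from typing import List


def credibility_score(source_scores: List[float],
                      supply_chain_depth: int,
                      client_uses: int) -> float:
    """Combine the four factors into a score in [0, 1].

    The weights and saturation points are placeholders, not calibrated values.
    """
    if not source_scores:
        return 0.0
    avg_source = sum(source_scores) / len(source_scores)  # credibility of the inputs
    count_factor = min(len(source_scores), 5) / 5          # more sources, capped at 5
    depth_factor = min(supply_chain_depth, 4) / 4          # closer to the origin, capped at 4
    usage_factor = min(client_uses, 100) / 100             # adoption by clients, capped at 100
    score = (0.4 * avg_source + 0.2 * count_factor
             + 0.2 * depth_factor + 0.2 * usage_factor)
    return round(score, 3)


example = credibility_score([0.9, 0.7], supply_chain_depth=2, client_uses=50)
```

Capping each factor keeps a single strong signal, such as very heavy client usage, from dominating the score.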

Data validation robot. This robot will cross-check internal information between similar or equivalent resources, composite or simple, and raise flags when it finds discrepancies. It will also assist in resolving them, with support from other robots, the dedicated team and the client community.
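A basic form of this cross-check compares the values recorded for a group of equivalent resources and flags any that stray too far from the group's median. The sketch below assumes a relative tolerance of 25 percent; the tolerance, the function name `flag_discrepancies` and the sample values are placeholders.

```python
from statistics import median
from typing import Dict, List


def flag_discrepancies(values: Dict[str, float], tolerance: float = 0.25) -> List[str]:
    """Return the IDs whose value differs from the group median
    by more than `tolerance` (relative difference)."""
    if len(values) < 2:
        return []  # nothing to compare against
    mid = median(values.values())
    if mid == 0:
        return [rid for rid, v in values.items() if v != 0]
    return [rid for rid, v in values.items() if abs(v - mid) / abs(mid) > tolerance]


# Placeholder footprints for three resources considered equivalent.
equivalents = {"res-a": 10.0, "res-b": 10.5, "res-c": 19.0}
flagged = flag_discrepancies(equivalents)
```

The median is used instead of the mean so that one extreme value does not shift the reference point it is being compared against.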

Resource allocation robot. Based on the information gathered and aggregated by the other robots, client inputs, resource usage, usage of similar resources, and data collection channels and sources, this robot will automatically allocate resources to the proper emission scope.
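For orientation, the GHG Protocol distinguishes Scope 1 (direct emissions from sources the company owns or controls), Scope 2 (indirect emissions from purchased electricity, steam, heating and cooling) and Scope 3 (all other indirect emissions). The sketch below replaces the robot's many inputs with two simplified ones to show the shape of the decision; the function name `allocate_scope`, its parameters and the resource type labels are hypothetical.

```python
from enum import IntEnum


class Scope(IntEnum):
    DIRECT = 1            # sources owned or controlled by the company
    PURCHASED_ENERGY = 2  # purchased electricity, steam, heating, cooling
    VALUE_CHAIN = 3       # all other indirect emissions


PURCHASED_ENERGY_TYPES = {"electricity", "steam", "heating", "cooling"}


def allocate_scope(resource_type: str, emitted_by_own_assets: bool) -> Scope:
    """Simplified rule set; a real allocation would weigh many more inputs."""
    if resource_type in PURCHASED_ENERGY_TYPES and not emitted_by_own_assets:
        return Scope.PURCHASED_ENERGY
    if emitted_by_own_assets:
        return Scope.DIRECT
    return Scope.VALUE_CHAIN


fleet_fuel = allocate_scope("diesel", emitted_by_own_assets=True)
grid_power = allocate_scope("electricity", emitted_by_own_assets=False)
bought_steel = allocate_scope("steel", emitted_by_own_assets=False)
```

In practice the rules would be informed by the usage data and collection channels described above rather than by a single flag.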

Industry specific robots. As the platform grows, specialized robots will be built for different industries, based on each industry's main sources of GHG and its specific data.

For example, a robot specialized in banking will know how to analyze client credit portfolio and scoring based on documents collected from the clients.

A robot specialized in retail will focus mainly on extracting data from the supply chain and will also help provide an interface where suppliers can input their own data.

Our roadmap includes robots for banking, retail, maritime transportation, facilities, oil and gas, energy, construction, etc.

Utility tools and robots. The system will have a set of tools to connect to the most common client systems and to extract information from most types of documents and images.

DoqAI. At the forefront of our data extraction capabilities is a machine learning model built on Natural Language Processing (NLP). With it, the system handles a diverse range of document types. For structured documents such as receipts and invoices, it uses pattern recognition and data mapping algorithms to extract the relevant information. For unstructured documents, the NLP model identifies and extracts key details from free-form content, which provides the adaptability a comprehensive GHG resource database needs. Handwritten documents are covered by the model's handwriting recognition capabilities. The model is refined through continuous learning, so that accuracy and efficiency in populating the GHG resource database keep pace with evolving data extraction demands.

System integrations: The system will be integrated with all our EmissionX platforms, exchanging data bidirectionally and offering support for data collection. It will also be integrated with available public sources of information, both free and subscription based.

Another part of the system is represented by a collection of tools offered to clients.

Client Interactions

The system will expose interfaces for clients where they can define new resources or build their own resources based on existing information in the database.

Clients will be able to use existing robots to speed up data collection, using either the generic AI robot or an industry specific robot.

Clients will receive automatic notifications, depending on their type of subscription, for new versions of information, changes in data quality, new correlations, new data usages, other user requests, etc.

Clients can export data through dedicated API interfaces.

Clients will benefit from an internal reporting platform.

Clients will also have a platform for exchanging information and sharing knowledge and GHG resources, either for free or under license.



Robot             | Model | TRL
AI Orchestrator   |       | 1
Integration       |       | 4
Discovery         |       | 2
Collection        |       | 8
Iterative         |       | 1
Correlation       |       | 3
Quality           |       | 3
Validation        |       | 4
Allocation        |       | 6
Extraction        | DocAI | 9
Industry Specific |       | 8