Introduction


Welcome to the Macrometa Edge Fabric (aka C8 fabric) documentation!

C8 is an edge-native, coordination-free data and compute platform that enables distributed, stateful apps and dynamic content at the edge.

The platform is a combination of

  1. Global database i.e., a multi-model, multi-master, decentralized, global & geo-fenced real-time database

  2. Global streams i.e., decentralized global & geo-fenced streams

  3. Compute service for deploying, orchestrating and executing containers and functions at the edge

We call it edge native because it is designed to sit across hundreds of worldwide locations (PoPs) and present one global, multi-master, real-time data platform (database and streams).

Fabric is the term we use for a collection of edge data centers linked together as a single, high-performance cloud computing system consisting of storage, networking and processing functions.

  • Fabric is the collection of all edge data centers that have been provisioned.

  • Fabric is created when a tenant account is provisioned with the edge locations.

  • Fabric contains Collections, Streams, Functions and Geo Fabrics.

Geo Fabrics are subsets of the fabric. Each one is composed of one or more of the fabric's edge locations, as defined by the tenant. A geo fabric contains the following services:

  • Collections are groupings of JSON documents and are like tables in an RDBMS. You can create any number of collections in a geo fabric, and a collection can have any number of documents.

  • Streams are a type of collection that capture data in motion. Streams support both pub-sub messaging and message queuing models. Messages are sent via streams by publishers to consumers who then do something with the message.

  • Functions (or containers) let users package code into containers and then deploy them on geo fabrics. Once they are deployed, C8 orchestrates the containers to execute on demand (i.e. serverless) in edge locations in response to requests from clients. Functions are deployed using the C8FN CLI tool and optionally via the web console.
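The containment relationships described above can be sketched as plain data structures. This is an illustration only, not the C8 API: the class names, field names, tenant name and region identifiers are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class GeoFabric:
    """A tenant-defined subset of the fabric's edge locations (hypothetical model)."""
    name: str
    locations: set[str]
    collections: dict[str, list[dict]] = field(default_factory=dict)
    streams: set[str] = field(default_factory=set)
    functions: set[str] = field(default_factory=set)


@dataclass
class Fabric:
    """All edge locations provisioned for one tenant (hypothetical model)."""
    tenant: str
    locations: set[str]
    geo_fabrics: dict[str, GeoFabric] = field(default_factory=dict)

    def create_geo_fabric(self, name: str, locations: set[str]) -> GeoFabric:
        # A geo fabric needs at least one location, all taken from the fabric.
        if not locations or not locations <= self.locations:
            raise ValueError("locations must be a non-empty subset of the fabric")
        geo = GeoFabric(name, set(locations))
        self.geo_fabrics[name] = geo
        return geo


fabric = Fabric("demo-tenant", {"us-west", "eu-central", "ap-south"})
europe = fabric.create_geo_fabric("europe", {"eu-central"})
europe.collections["orders"] = [{"item": "book", "qty": 1}]
europe.streams.add("order-events")
```

Asking for a location that is not part of the fabric raises a `ValueError`, which mirrors the rule that a geo fabric is always a subset of the fabric.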

Key capabilities

C8 fabric makes it easy to build scalable, highly responsive applications at global scale while hiding the complexity of distributed and decentralized systems:

  • You can distribute your database, streams and functions to any number of regions with the click of a button. This enables you to put your data & compute where your users are, ensuring the lowest possible latency to your customers.

  • Application read and write requests are always sent automatically to the nearest data center. As you add regions to or remove regions from your C8 fabric, your application does not need to be redeployed and continues to be highly available.

  • Multiple data models (document, graph, key-value) and popular APIs for accessing and querying data.

  • Streams - Global publish-subscribe messaging and message queuing in a unified model.

  • C8QL - Native C8 Query language with rich querying capabilities.

  • Real-time database (i.e., push based updates across multiple regions)

  • Serverless functions running close to where data resides

  • API drivers in multiple languages - Python, JavaScript, Java, C++

  • Additional data models and APIs are coming soon!
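To make the "nearest data center" idea in the list above concrete, here is a minimal sketch that picks the closest region by great-circle distance. How C8 actually routes requests is not described here, so treat this purely as an illustration: the region names, coordinates and function names are assumptions.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0

# Hypothetical region names with approximate (latitude, longitude) in degrees.
regions = {
    "us-west": (37.77, -122.42),
    "eu-central": (50.11, 8.68),
    "ap-south": (19.08, 72.88),
}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine formula)."""
    lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
    h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(h))


def nearest_region(client, available):
    """Name of the region in `available` that is closest to `client`."""
    return min(available, key=lambda name: distance_km(client, available[name]))


paris = (48.86, 2.35)
first_choice = nearest_region(paris, regions)

# Removing a region changes the answer without any change to the client code.
remaining = {name: loc for name, loc in regions.items() if name != "eu-central"}
second_choice = nearest_region(paris, remaining)
```

The client logic stays the same when the set of regions changes, which is the property the list describes: regions can be added or removed without redeploying the application.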

Multi-Master & Multi-Region

Developing globally distributed applications that respond with local latencies while maintaining consistent views of data worldwide is a challenging problem. Customers use globally distributed databases because they need to improve data access latency, achieve high data availability, guarantee disaster recovery and meet their business requirements.

Multi-master in C8 provides high availability, local write latencies and scalability, with built-in, flexible conflict resolution support. These features significantly simplify the development of globally distributed applications, for which multi-master support is crucial.

With C8 fabric multi-master support, you can perform writes on data (for example, documents, collections, graphs and key-value pairs) distributed anywhere in the world. You can update data in any region that is associated with your fabric account. These data updates propagate asynchronously.

In addition to providing fast access and write latency to your data, multi-master also provides a practical solution for failover and load-balancing issues.

In short,

you can perform reads and writes on data (for example, documents, collections, graphs and key-value pairs) distributed anywhere in the world. The data updates propagate asynchronously, with automatic resolution of any conflicts.
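The behaviour summarized above (local writes, asynchronous propagation, automatic conflict resolution) can be sketched with a toy model. The text does not say which resolution strategy C8 uses, so this sketch assumes a simple last-writer-wins rule with a region-name tie-break. The `Region` class, the `replicate` function and the timestamps are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Region:
    name: str
    store: dict = field(default_factory=dict)    # key -> (timestamp, origin, value)
    outbox: list = field(default_factory=list)   # updates not yet replicated

    def write(self, key, value, timestamp):
        """Accept a write locally, then queue it for asynchronous replication."""
        update = (key, (timestamp, self.name, value))
        self.apply(update)
        self.outbox.append(update)

    def apply(self, update):
        """Assumed rule: keep the version with the highest (timestamp, origin)."""
        key, version = update
        current = self.store.get(key)
        if current is None or version[:2] > current[:2]:
            self.store[key] = version

    def read(self, key):
        return self.store[key][2]


def replicate(regions):
    """Deliver every queued update to every other region."""
    for source in regions:
        for update in source.outbox:
            for target in regions:
                if target is not source:
                    target.apply(update)
        source.outbox.clear()


us, eu = Region("us-west"), Region("eu-central")
us.write("cart:42", {"items": 1}, timestamp=100)
eu.write("cart:42", {"items": 2}, timestamp=105)   # concurrent write elsewhere

before = (us.read("cart:42"), eu.read("cart:42"))  # regions disagree for now
replicate([us, eu])
after = (us.read("cart:42"), eu.read("cart:42"))   # both converge on one value
```

Both writes succeed locally without coordination. Only after replication do the two regions agree, which is the trade-off that asynchronous multi-master replication makes in exchange for local write latency.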

The same applies to streams: developing globally distributed real-time applications is a challenging problem. With C8 fabric, you can create global or geo-fenced streams that replicate anywhere in the world.

In short,

you can publish data in any region that is associated with your fabric account and subscribe in any region that stream replicates to.

Natively Multi-Model

When choosing the right technology for a new project, ongoing development or a full system upgrade, it can be hard to identify the tools that will match the project's criteria from start to finish, especially when it comes to choosing the right database.

It has been actively discussed and debated by many experts that one size does not always fit all. This idea suggests that one would use different data models for different parts of large software architectures.

That means using multiple databases in the same project, which can cause operational friction as well as data consistency and duplication issues. This is where a native multi-model database, with all its benefits and flexibility, comes into play.

So what is a native multi-model database?

A native multi-model database is – simply put – a combination of several data stores in one. In a multi-model database data can be stored as key/value pairs, graphs or documents and can be accessed with one declarative query language. It is also possible to combine different models in a single query.

With a native multi-model approach you can build high-performance applications and scale horizontally using all three data models to their full extent. Compared with the “layered approach” that many vendors adopt, a native multi-model solution offers flexibility and performance advantages.

In short,

a native multi-model database has one core, one query language, but multiple data models.

You can model your data and access collections in the following ways:

Key Value interface - The key/value store data model is the easiest to scale. A regular collection always has a primary attribute and in the absence of further secondary indexes the document collection behaves like a simple key/value store.

  • The only operations that are possible in this context are single key lookups and key/value pair insertions and updates.
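As a rough illustration of the key/value usage pattern described above, the following sketch restricts a collection to single-key lookups, insertions and updates. It is an in-memory stand-in, not the C8 driver API: the class and method names are hypothetical.

```python
class KeyValueCollection:
    """A collection accessed only through its primary key (hypothetical model)."""

    def __init__(self):
        self._docs = {}

    def insert(self, key, value):
        if key in self._docs:
            raise KeyError(f"key already exists: {key!r}")
        self._docs[key] = value

    def update(self, key, value):
        if key not in self._docs:
            raise KeyError(f"unknown key: {key!r}")
        self._docs[key] = value

    def get(self, key):
        """Single-key lookup; there is no way to query by value."""
        return self._docs[key]


sessions = KeyValueCollection()
sessions.insert("user:1", {"theme": "dark"})
sessions.update("user:1", {"theme": "light"})
```

Because every operation names exactly one key, data of this shape is easy to partition across servers, which is one reason this model is described as the easiest to scale.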

Document interface - The documents you can store in a regular collection closely follow the JSON format.

  • A document contains zero or more attributes with each of these attributes having a value. A value can either be an atomic type, i.e. number, string, boolean or null, or a compound type, i.e. an array or embedded document/object. Arrays and sub-objects can contain all of these types, which means that arbitrarily nested data structures can be represented in a single document.

  • Documents are grouped into collections. A collection contains zero or more documents. If you are familiar with RDBMS, then it is safe to compare collections to tables, and documents to rows.

  • In a traditional RDBMS, you have to define columns before you can store records in a table. Such definitions are also known as schemas. Collections are schema-less, and there is no need to define what attributes a document must have. Documents can have a completely different structure and still be stored together with other documents in a single collection.

  • In practice, there will be common denominators among the documents in a collection, but Macrometa itself doesn't force you to limit yourself to a certain data structure.
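The document model described above maps directly onto JSON. The sketch below uses only the Python standard library, with invented sample data, to show two documents of different structure living in the same collection, including nested arrays and objects.

```python
import json

# Two documents with different structures in the same (schema-less) collection.
users = [
    {"name": "Ada", "age": 36, "active": True, "manager": None},
    {
        "name": "Lin",
        "tags": ["admin", "ops"],                            # array
        "address": {"city": "Oslo", "geo": [59.9, 10.7]},    # nested object
    },
]

# Every document round-trips through JSON unchanged.
encoded = json.dumps(users)
decoded = json.loads(encoded)

# No schema: the set of attributes differs from document to document.
attribute_sets = [sorted(doc) for doc in decoded]
```

The only attribute the two documents share is `name`, an example of the "common denominators" that tend to exist in practice even though nothing enforces them.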

Graph interface - You can turn your documents into graph structures for semantic queries with nodes, edges and properties to represent and store data. A key concept of the system is the idea of a graph, which directly relates data items in the database.

  • A graph collection is simply a regular collection but with some special attributes that enable you to create graph queries and analyze the relationships between objects.

  • In SQL databases, you have the notion of a relation table to store n:m relationships between two data tables. An edge collection is somewhat similar to these relation tables; vertex collections resemble the data tables with the objects to connect.

  • While simple graph queries with a fixed number of hops via the relation table may be doable in SQL with several nested joins, graph databases can handle an arbitrary number of these hops over edge collections. This is called traversal. Edges in one edge collection may also point to several vertex collections. It is common to attach attributes to edges, e.g. a label naming the connection.

  • Edges have a direction, with their relations _from and _to pointing from one document to another document stored in vertex collections. In queries you can define in which directions the edge relations may be followed.
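The traversal idea above can be sketched in a few lines. This is not C8QL and not the driver API; it is a minimal in-memory breadth-first traversal over an edge list that uses the `_from` and `_to` attributes mentioned in the text. The sample vertices, the direction names and the function names are assumptions made for illustration.

```python
from collections import deque

# Vertex collection: documents keyed by a hypothetical id.
people = {
    "alice": {"name": "Alice"},
    "bob": {"name": "Bob"},
    "carol": {"name": "Carol"},
    "dave": {"name": "Dave"},
}

# Edge collection: each edge points from one vertex to another and may carry attributes.
follows = [
    {"_from": "alice", "_to": "bob", "label": "follows"},
    {"_from": "bob", "_to": "carol", "label": "follows"},
    {"_from": "dave", "_to": "alice", "label": "follows"},
]


def neighbours(vertex, edges, direction):
    """Yield vertices one hop away, following edges in the given direction."""
    for edge in edges:
        if direction in ("outbound", "any") and edge["_from"] == vertex:
            yield edge["_to"]
        if direction in ("inbound", "any") and edge["_to"] == vertex:
            yield edge["_from"]


def traverse(start, edges, direction="outbound", max_depth=2):
    """Breadth-first traversal returning {vertex: hops from start}."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        vertex = queue.popleft()
        if depth[vertex] == max_depth:
            continue
        for nxt in neighbours(vertex, edges, direction):
            if nxt not in depth:
                depth[nxt] = depth[vertex] + 1
                queue.append(nxt)
    del depth[start]
    return depth


outbound = traverse("alice", follows)                        # whom alice reaches
inbound = traverse("alice", follows, direction="inbound")    # who reaches alice
names = [people[key]["name"] for key in outbound]
```

Changing `max_depth` changes the number of hops without rewriting the query logic, which is the contrast with a fixed chain of nested SQL joins.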

Real-time Database

When your app polls for data, it becomes slow, unscalable, and cumbersome to maintain. C8 makes building realtime apps dramatically easier. It is a great choice when your applications could benefit from realtime feeds to your data.

The query-response database access model works well on the web because it maps directly to HTTP’s request-response. However, modern applications require sending data directly to the client in realtime. Use cases that can benefit from C8 realtime push architecture include:

  • Collaborative web and mobile apps

  • Streaming analytics apps

  • Multiplayer games

  • Realtime marketplaces

  • Connected devices

For example, when a user changes the position of a button in a collaborative design app, the server has to notify other users that are simultaneously working on the same project. Web browsers support these use cases via WebSockets and long-lived HTTP connections, but adapting database systems to realtime needs still presents a huge engineering challenge.

The C8 database is designed specifically to push data to applications in realtime across multiple data centers. It dramatically reduces the time and effort necessary to build scalable realtime apps.
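The difference between polling and push can be illustrated with the collaborative design example above. The sketch below is an in-memory stand-in, not the C8 API: `RealtimeCollection`, `on_change` and `upsert` are hypothetical names. Each write is delivered to registered listeners immediately, so no client has to ask repeatedly whether anything changed.

```python
class RealtimeCollection:
    """In-memory stand-in for a collection that pushes changes to listeners."""

    def __init__(self):
        self._docs = {}
        self._listeners = []

    def on_change(self, callback):
        """Register a callback that receives (key, document) for every write."""
        self._listeners.append(callback)

    def upsert(self, key, doc):
        self._docs[key] = doc
        for callback in self._listeners:
            callback(key, doc)   # pushed immediately; nobody has to poll


canvas = RealtimeCollection()
seen_by_bob = []
canvas.on_change(lambda key, doc: seen_by_bob.append((key, doc)))

# Another user moves a button twice; Bob's client is notified both times.
canvas.upsert("button-1", {"x": 10, "y": 20})
canvas.upsert("button-1", {"x": 40, "y": 20})
```

In a real deployment the callback would typically be fed by a long-lived connection such as a WebSocket rather than by an in-process function call.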

GeoFabric Streams

Streams are a type of collection in C8 fabric that capture data-in-motion. Messages are sent via streams by publishers to consumers who then do something with the message. C8Streams can be created via client drivers (PyC8), REST API or the web console.

C8Streams combines queuing and pub-sub messaging into a single messaging model, which gives users the flexibility to consume messages in whichever way best fits the use case at hand.

producer→stream→subscription→consumer

  • A stream is a named channel for sending messages. Each stream is backed by a distributed append-only log and can be local (at one edge location only) or global (across all edge locations in the Super Fabric). Similarly the streams can be persistent or non-persistent.

  • Messages from publishers are only stored once on a stream, and can be consumed as many times as necessary by consumers. The stream is the source of truth for consumption. Although messages are only stored once on the stream, there can be different ways of consuming these messages.

  • Consumers are grouped together for consuming messages. Each group of consumers is a subscription on a stream. Each consumer group can have its own way of consuming the messages: exclusive, shared, or failover.
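The three consumption modes named above can be sketched as a small dispatch function. The text only names the modes, so the semantics here are an assumption based on how these terms are commonly used in pub-sub systems: exclusive allows a single consumer, shared distributes messages round-robin, and failover sends everything to the first consumer that is still up. The function and parameter names are hypothetical.

```python
from itertools import cycle


def dispatch(messages, consumers, mode, failed=()):
    """Return {consumer: [messages]} for one subscription.

    `failed` lists consumers that are down; it is only relevant for failover.
    """
    received = {name: [] for name in consumers}
    if mode == "exclusive":
        if len(consumers) != 1:
            raise ValueError("an exclusive subscription allows exactly one consumer")
        received[consumers[0]] = list(messages)
    elif mode == "shared":
        # Round-robin: each message goes to exactly one consumer in the group.
        for message, name in zip(messages, cycle(consumers)):
            received[name].append(message)
    elif mode == "failover":
        # The first consumer that is not marked as failed receives everything.
        active = next(name for name in consumers if name not in failed)
        received[active] = list(messages)
    else:
        raise ValueError(f"unknown mode: {mode}")
    return received


log = ["m1", "m2", "m3", "m4"]   # stored once on the stream

# Two subscriptions read the same log independently, each in its own mode.
billing = dispatch(log, ["b1", "b2"], "shared")
audit = dispatch(log, ["a1", "a2"], "failover", failed={"a1"})
```

The same four messages are consumed twice, once per subscription, without being stored twice. That is the sense in which the stream is the source of truth for consumption.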

GeoFabric Functions

GeoFabric Functions is a serverless execution environment for building and connecting services. With GeoFabric Functions you write simple, single-purpose functions that can be executed either via HTTP or via events emitted from GeoFabric streams and database services. Your GeoFabric Function is triggered when an event being watched is fired. Your code executes in a fully managed environment, so there is no need to provision infrastructure or manage servers.
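The event-triggered model described above can be sketched as a registry that maps an event source to the functions watching it. This is a conceptual illustration, not the GeoFabric Functions programming model: the `on_event` decorator, the `emit` helper and the `stream:orders` source name are all hypothetical.

```python
handlers = {}


def on_event(source):
    """Register a function to run whenever `source` emits an event."""
    def register(func):
        handlers.setdefault(source, []).append(func)
        return func
    return register


def emit(source, event):
    """Invoke every function watching `source` and collect the results."""
    return [func(event) for func in handlers.get(source, [])]


@on_event("stream:orders")
def total_price(event):
    # A simple, single-purpose function: compute the total for one order.
    return event["qty"] * event["unit_price"]


results = emit("stream:orders", {"qty": 3, "unit_price": 5})
unwatched = emit("stream:unknown", {})
```

The function author only writes `total_price`. Deciding when and where it runs is left to the platform, which is what "fully managed" means in this context.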

Capability comparison

C8 fabric provides the best capabilities of traditional relational and non-relational databases and messaging systems.

| Capabilities | Relational databases | NoSQL databases | Streams | C8 Geo Fabric |
|---|---|---|---|---|
| Global distribution | No | No | No | Yes |
| Horizontal scale | No | Yes | Yes | Yes |
| Latency guarantees | No | Yes | Yes | Bounded latencies |
| High availability | No | Yes | Yes | Yes |
| Data model + API | Relational (SQL) | Multi-model + OSS API | Messages | Multi-model, C8QL, SQL (coming soon) |
| Real-time (push-based) updates | No | No | Yes | Yes |
| Conflict resolution | No | No | N/A | Yes |
| Geo-fencing | No | No | No | Yes |
| Geospatial support | No | No | N/A | Yes |
| Multi-tenancy | No | No | No | Yes |
| Serverless functions | No | No | No | Yes |

Solutions that benefit from C8 fabric

Any web, mobile, gaming, finance, CDN, analytics or IoT application that needs to handle massive amounts of data, reads and writes at scale, with near-real-time response times for a variety of data, will benefit from C8 fabric's high availability, high throughput, low latency, and converged real-time database and streaming capabilities.

Next Steps

C8 fabric is an edge-native, coordination-free streaming and multi-model real-time database with flexible data models for documents, graphs and key-values. Build high-performance applications using a convenient SQL-like query language or JavaScript extensions. Use ACID transactions if you require them. Scale horizontally within and across regions, and vertically within a region, with a few mouse clicks.

Key features include:

  • Flexible data modeling: model your data as a combination of key-value pairs, documents or graphs. Graphs are well suited to social relations.

  • Powerful query language: to retrieve and modify data.

  • Indexing: various types of indexes, including hash, skiplist, geo, persistent and full-text.

  • Graphs: for treating relationships between data as important as data itself.

  • Transactions: run queries on multiple documents or collections with optional transactional consistency and isolation.

  • Replication and Sharding: spread bigger datasets across multiple servers with built-in replication.

  • GeoFabric Streams: for low latency, high throughput global pub-sub messaging and message queuing.