Introduction

RedGrease is a Python client and runtime package that aims to make it as easy as possible to create and execute RedisGears functions on Redis engines with the RedisGears module loaded.

Overview

RedGrease makes it easy to write concise but expressive Python functions to query and/or react to data in Redis in real time. The functions are automatically distributed and run across the shards of the Redis cluster (if any), combining the performance of distributed computation with the expressiveness and power of Python.

It may help you create:

  • Advanced analytical queries,

  • Event based and streaming data processing,

  • Custom Redis commands and interactions,

  • And much, much more…

… all written in Python and running distributed ON your Redis nodes.

The Gear functions may include and use third-party dependencies such as numpy, requests, gensim, or pretty much any other Python package distribution you may need for your use-case.

If you are already familiar with Redis and RedisGears, you can jump directly to the What is RedGrease? overview or the Quickstart Guide. Otherwise, read on to get up to speed on these technologies.

What is Redis?

Redis is a popular in-memory data structure store, used as a distributed, in-memory, key–value database, cache and message broker, with optional durability, horizontal scaling and high availability. Redis supports different kinds of abstract data structures, such as strings, lists, maps, sets, sorted sets, HyperLogLogs, bitmaps, streams, and spatial indexes. The project is developed and maintained by RedisLabs. It is open-source software released under a BSD 3-clause license.

What is Redis Gears?

RedisGears is an official extension module for Redis, also developed by RedisLabs, which allows for distributed Python computations on the Redis server itself.

From the official RedisGears site:

“RedisGears is a dynamic framework that enables developers to write and execute functions that implement data flows in Redis, while abstracting away the data’s distribution and deployment. These capabilities enable efficient data processing using multiple models in Redis with infinite programmability, while remaining simple to use in any environment.”

When the Redis Gears module is loaded onto the Redis engines, the Redis engine command set is extended with new commands to register, distribute, manage and run so-called Gear Functions, written in Python, across the shards of the Redis database.

Client applications can define and submit such Python Gear Functions, either to run immediately as ‘batch jobs’, or to be registered to be triggered on events, such as Redis keyspace changes, stream writes or external triggers. The Redis Gears module handles the complexities of distributing, coordinating, scheduling and executing the Gear Functions, and of collecting and aggregating their results.
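
The difference between the two execution modes can be illustrated with a small toy model in plain Python. This is only a sketch: it does not use Redis, RedisGears or RedGrease, the names `ToyEngine`, `run`, `register` and `set` are made up for the illustration, and key patterns are approximated with Python's `fnmatch`.

```python
from fnmatch import fnmatchcase


class ToyEngine:
    """In-memory stand-in for a Redis engine (hypothetical, illustration only)."""

    def __init__(self):
        self.data = {}
        self.registrations = []  # (key pattern, function) pairs

    def run(self, function, pattern="*"):
        """Batch mode: apply the function once to all existing matching keys."""
        return [
            function(key, value)
            for key, value in sorted(self.data.items())
            if fnmatchcase(key, pattern)
        ]

    def register(self, function, pattern="*"):
        """Event mode: remember the function and call it on future writes."""
        self.registrations.append((pattern, function))

    def set(self, key, value):
        """Write a key, then trigger every registered function whose pattern matches."""
        self.data[key] = value
        for pattern, function in self.registrations:
            if fnmatchcase(key, pattern):
                function(key, value)


engine = ToyEngine()
engine.set("user:1", "alice")
engine.set("job:7", "resize")

# Batch job: runs immediately over the data that exists right now.
batch_result = engine.run(lambda key, value: value.upper(), pattern="user:*")

# Registered function: runs later, each time a matching key is written.
seen = []
engine.register(lambda key, value: seen.append(key), pattern="user:*")
engine.set("user:2", "bob")    # matches the pattern, so the function is triggered
engine.set("job:8", "encode")  # does not match the pattern, so nothing happens
```

In this model the batch job only sees `user:1`, because `user:2` did not exist when it ran, while the registered function only sees `user:2`, because it was registered after `user:1` was written.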

Figure: Redis Gears Processing Pipeline Overview

What are Gear Functions?

Gear Functions are composed as a sequence of steps, or operations, such as Map, Filter, Aggregate, GroupBy and more.

These operations are parameterized with Python functions that you define according to your needs.

The steps / operations are ‘piped’ together by the Redis Gears runtime such that the output of one step / operation becomes the input to the subsequent step / operation, and so on.
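
As a rough mental model of this piping, the following plain-Python sketch chains a few operations so that each one consumes the output of the previous one. `ToyPipeline` is a hypothetical class written for this illustration only; it is not part of RedisGears or RedGrease, and the real runtime additionally distributes the work across shards.

```python
from functools import reduce


class ToyPipeline:
    """Chains operations so each step consumes the previous step's output."""

    def __init__(self, records):
        self.records = records  # plays the role of the input source
        self.steps = []

    def map(self, fn):
        self.steps.append(lambda records: (fn(r) for r in records))
        return self

    def filter(self, predicate):
        self.steps.append(lambda records: (r for r in records if predicate(r)))
        return self

    def aggregate(self, zero, fn):
        # Collapses all incoming records into a single output record.
        self.steps.append(lambda records: [reduce(fn, records, zero)])
        return self

    def run(self):
        stream = iter(self.records)
        for step in self.steps:
            stream = step(stream)  # output of one step is input to the next
        return list(stream)


ages = {"person:1": 34, "person:2": 17, "person:3": 52}

result = (
    ToyPipeline(ages.items())
    .map(lambda item: item[1])                     # keep only the age
    .filter(lambda age: age >= 18)                 # drop minors
    .aggregate(0, lambda total, age: total + age)  # sum the remaining ages
    .run()
)
```

Each operation is parameterized with an ordinary Python function, and the order in which the operations are chained is the order in which the records flow through them.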

The first step / operation of any Gear Function is always one of six available “Readers”, defining the source of the input to the first step / operation.

Readers can be parameterized to narrow down the subset of data they should operate on, for example by specifying a pattern for the keys or streams they should read.

Depending on the reader type, Gear Functions can either be run immediately, on demand, as batch jobs, or in an event-driven manner by registering them to trigger automatically on various types of events.

Each shard of the Redis Cluster executes its own ‘instance’ of the Gear Function in parallel on the relevant local shard data, unless explicitly collected, or until it is implicitly reduced to its final global result at the end of the function.
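
The idea of shard-local execution followed by a global reduction can be sketched in plain Python as follows. The shard contents and the helper names `local_count` and `collect` are invented for this illustration; the sketch runs sequentially and says nothing about how RedisGears actually schedules or transports the partial results.

```python
from collections import Counter

# Three hypothetical shards, each holding part of the data set.
shards = [
    {"record:1": "new", "record:2": "failed"},
    {"record:3": "new"},
    {"record:4": "failed", "record:5": "success", "record:6": "new"},
]


def local_count(shard):
    """Runs independently per shard and sees only that shard's data."""
    return Counter(shard.values())


def collect(partials):
    """Gathers the per-shard partial results and merges them into one result."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return dict(total)


# In a real cluster the local step would run in parallel on every shard.
partial_counts = [local_count(shard) for shard in shards]
global_count = collect(partial_counts)
```

Only the small per-shard counters need to be brought together in the final step, which is what makes this pattern attractive for large data sets.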

You can find more details about the internals of Gear Functions in the official Documentation.

What is RedGrease?

The RedGrease package provides a number of features that facilitate writing and executing Gear Functions:

  1. Redis / Redis Gears client(s).

    Extended versions of the redis and redis-py-cluster Python clients, with additional pythonic functions that map closely (1-to-1) to the Redis Gears command set (e.g. RG.PYEXECUTE, RG.GETRESULTS, RG.TRIGGER, RG.DUMPREGISTRATIONS etc.), as outlined in the official Gears documentation.

    import redgrease
    
    gear_script = ... # Gear function string, a GearFunction object or a script file path.
    
    rg = redgrease.RedisGears()
    rg.gears.pyexecute(gear_script)  # <-- RG.PYEXECUTE
    
  2. Runtime function wrappers.

    The RedisGears server runtime environment automatically loads a number of special functions into the top level scope (e.g. GearsBuilder, execute(), log() etc.). RedGrease provides placeholder versions that offer docstrings, auto-completion and type hints during development, and do not clash with the actual runtime.

  3. Server-side Redis commands.

    Allows most Redis (v.6) commands to be executed in the server-side function, against the local shard, as if using a Redis ‘client’ class, instead of explicitly invoking the corresponding command string using execute(). It is basically the redis Python client, but with redis.Redis.execute_command() rewired to use the Gears-native redgrease.runtime.execute() under the hood.

    import redgrease
    
    
    # This function runs **on** the Redis server.
    def download_image(annotation):
        # Imported inside the function, since `requests` was never imported in this
        # example and is installed on the server through `requirements` below.
        import requests

        img_id = annotation["image_id"]
        img_key = f"image:{img_id}"
        if redgrease.cmd.hexists(img_key, "image_data"):  # <- hexists
            # image already downloaded
            return img_key
        redgrease.log(f"Downloading image for annotation: {annotation}")
        image_url = redgrease.cmd.hget(img_key, "url")  # <- hget
        response = requests.get(image_url)
        redgrease.cmd.hset(img_key, "image_data", bytes(response.content))  # <- hset
        return img_key
    
    
    # Redis connection (with Gears)
    connection = redgrease.RedisGears()
    
    # Automatically download corresponding image, whenever an annotation is created.
    image_keys = (
        redgrease.KeysReader()
        .values(type="hash", event="hset")
        .foreach(download_image, requirements=["requests"])
        .register("annotation:*", on=connection)
    )
    
  4. First class GearFunction objects.

    Inspired by the “remote builders” of the official redisgears-py client, but with some differences, e.g.:


    import redgrease
    
    
    def schedule(record):
        status = record.value.get("status", "new")
        redgrease.log(f"Scheduling '{status}' record: {record.key}")
        if status == "new":
            record.value["status"] = "pending"
            redgrease.cmd.hset(record.key, "status", "pending")
            redgrease.cmd.xadd("to_be_processed", {"record": record.key})
        ...
        return record
    
    
    def process(item):
        redgrease.log(f"processing {item}")
        success = len(item["record"]) % 3  # Mock processing
        redgrease.cmd.hset(item["record"], "status", "success" if success else "failed")
    
    
    def has_status(status):
        return lambda record: record.value.get("status", None) == status
    
    
    key_pattern = "record:*"
    
    records = redgrease.KeysReader().records(type="hash")
    
    record_listener = records.foreach(schedule).register(key_pattern, eventTypes=["hset"])
    
    get_failed = records.filter(has_status("failed"))
    
    count_by_status = (
        records.countby(lambda r: r.value.get("status", "unknown"))
        .map(lambda r: {r["key"]: r["value"]})
        .aggregate({}, lambda a, r: dict(a, **r))
        .run(key_pattern)
    )
    
    process_records = (
        redgrease.StreamReader()
        .values()
        .foreach(process, requirements=["numpy"])
        .register("to_be_processed")
    )
    
    server = redgrease.RedisGears()
    
    # Different ways of executing
    server.gears.pyexecute(record_listener)
    process_records.on(server)
    
    failed = get_failed.run(key_pattern, on=server)
    count = count_by_status.on(server)
    
  5. A Command Line Tool.

    Helps with running and/or loading Gears script files onto a RedisGears instance. Particularly useful for “trigger-based” CommandReader Gears.

    It also provides a simple form of ‘hot-reloading’ of RedisGears scripts: it continuously monitors directories containing Redis Gears scripts and automatically ‘pyexecutes’ them on a Redis Gears instance whenever it detects modifications.

    The main purpose is to streamline development of ‘trigger-style’ Gear scripts.

    redgrease --server 10.0.2.21 --watch scripts/
    
  6. A bunch of helper functions and methods for common boilerplate tasks.
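
The ‘hot-reloading’ behaviour of the command line tool (item 5 above) boils down to detecting which script files are new or modified since the last check. The sketch below shows one simple way this could be done, by polling file modification times with the standard library. It is an assumption-laden illustration, not the actual implementation of the redgrease tool, and the function name `changed_scripts` is hypothetical.

```python
import os
import tempfile


def changed_scripts(directory, last_seen):
    """Return paths of .py files that are new or modified since the last scan.

    `last_seen` maps path -> modification time (ns) and is updated in place.
    """
    changed = []
    for name in sorted(os.listdir(directory)):
        if not name.endswith(".py"):
            continue
        path = os.path.join(directory, name)
        mtime = os.stat(path).st_mtime_ns
        if last_seen.get(path) != mtime:
            last_seen[path] = mtime
            changed.append(path)
    return changed


with tempfile.TemporaryDirectory() as scripts_dir:
    script = os.path.join(scripts_dir, "gear.py")
    with open(script, "w") as f:
        f.write("# version 1\n")

    seen = {}
    first_scan = changed_scripts(scripts_dir, seen)   # the new file is reported
    second_scan = changed_scripts(scripts_dir, seen)  # nothing has changed

    # Simulate an edit by moving the modification time one second forward.
    stat = os.stat(script)
    os.utime(script, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    third_scan = changed_scripts(scripts_dir, seen)   # the change is reported

    reported = [os.path.basename(p) for p in first_scan + third_scan]
```

A watcher built on this idea would call the scan in a loop with a short sleep, and submit every reported script for execution on the server.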

Example Use-Cases

The possible use-cases for Redis Gears, and consequently RedGrease, are virtually endless, but some common, or otherwise interesting, use-cases include:

  • Automatic Cache-miss handling.

    Make Redis automatically fetch and cache the requested resource, so that clients do not have to handle cache-misses.

  • Automatic batched write-through / write-behind.

    Make Redis automatically write updates back to a slower, high-latency datastore, efficiently using batch writes. This allows clients to write high-velocity updates to Redis uninterrupted, without having to deal with the slow datastore.

    Figure: Write-Through / Write-Behind example

  • Advanced Data Queries and Transforms.

    Perform “Map-Reduce”-like queries on Redis datasets.

  • Stream event processing.

    Trigger processes automatically when data enters Redis.

  • Custom commands.

    Create custom Redis commands with arbitrarily sophisticated logic, making new features available to virtually any platform with a Redis client implementation.
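
To make the batched write-behind use-case from the list above more concrete, here is a minimal plain-Python sketch of the underlying idea: updates are accepted quickly, buffered, and forwarded to the slow store in batches. The class `WriteBehindBuffer` and its parameters are hypothetical and only illustrate the pattern; in a real setup the buffering would be done by a Gear Function registered on Redis, not by client-side code like this.

```python
class WriteBehindBuffer:
    """Collects updates and flushes them to a slow store in batches."""

    def __init__(self, flush_batch, batch_size=3):
        self.flush_batch = flush_batch  # callable that writes a list of updates
        self.batch_size = batch_size
        self.pending = {}

    def write(self, key, value):
        """Fast path: record the update; flush once enough have accumulated."""
        self.pending[key] = value  # a later write to the same key wins
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """Send all pending updates to the slow store as a single batch."""
        if self.pending:
            self.flush_batch(list(self.pending.items()))
            self.pending = {}


slow_store_batches = []  # stands in for the slow, high-latency datastore
write_behind = WriteBehindBuffer(slow_store_batches.append, batch_size=3)

write_behind.write("a", 1)
write_behind.write("b", 2)
write_behind.write("a", 10)  # overwrites the pending value; still two keys pending
write_behind.write("c", 3)   # third distinct key triggers a batch write
write_behind.write("d", 4)
write_behind.flush()         # flush the remainder explicitly
```

Five client writes result in only two writes to the slow store here, and the intermediate value of `a` never reaches it at all.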

Glossary

Gear Function

Gear Function, written as two separate words, refers to any valid Gear function, as defined in the Redis Gears Documentation, regardless of whether it was constructed as a pure string, loaded from a file, or programmatically built using RedGrease’s GearFunction constructors.

GearFunction

GearFunction, written as one word, refers specifically to RedGrease objects of type redgrease.GearFunction.

These are constructed programmatically using either redgrease.GearsBuilder, any of the Reader classes such as redgrease.KeysReader, redgrease.StreamReader, redgrease.CommandReader etc, or function decorators such as redgrease.trigger and so on.

It does not refer to Gear Functions that are loaded from strings, either explicitly or from files.


Courtesy of : Lyngon Pte. Ltd.