Robot Drivers

  • The naming pattern of a Driver is:

    <Noun>[_Noun].<Verb>

    • <Noun> is the Resource Type e.g. “vm”

    • [_Noun] is the optional hardware e.g. “_kvm”

    • <Verb> is the Timeout state that needs to be executed for the Resource e.g.

  • Each Driver functions independently; Drivers do not communicate with each other.
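The naming pattern above can be illustrated with a small parser. This is a minimal sketch, not part of Robot itself: the `DriverName` type, the `parse_driver_name` function and the `KNOWN_HARDWARE` set are hypothetical names, and the set of hardware suffixes is an assumption drawn from the Driver names listed in this document. Because a Resource Type such as "virtual_router" itself contains an underscore, the sketch only treats the last segment as hardware when it is a known suffix.

```python
from typing import NamedTuple, Optional

# Assumption: hardware suffixes inferred from the Driver names in this document.
KNOWN_HARDWARE = {'hyperv', 'kvm', 'phantom'}


class DriverName(NamedTuple):
    resource_type: str
    hardware: Optional[str]
    verb: str


def parse_driver_name(name: str) -> DriverName:
    """Split '<Noun>[_Noun].<Verb>' into resource type, optional hardware and verb."""
    driver, sep, verb = name.partition('.')
    if not sep or not driver or not verb:
        raise ValueError(f'Expected <Noun>[_Noun].<Verb>, got {name!r}')
    # Only the last underscore-separated segment can be the hardware part.
    head, _, tail = driver.rpartition('_')
    if head and tail in KNOWN_HARDWARE:
        return DriverName(head, tail, verb)
    return DriverName(driver, None, verb)


print(parse_driver_name('vm_kvm.build'))
print(parse_driver_name('virtual_router.scrub'))
```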

List of Supported Drivers and Verbs

Driver                  Verbs                                                                     Count
backup_hyperv           build, scrub                                                              2
backup_kvm              build, scrub                                                              2
ceph                    build, scrub, updatequiesced, updaterunning                               4
gpu                     updatequiesced, updaterunning                                             2
snapshot_hyperv         build, scrub, updaterunning                                               3
snapshot_kvm            build, scrub, updaterunning                                               3
virtual_router          build, quiesce, restart, scrub, scrubprep, updatequiesced, updaterunning  7
virtual_router_phantom  build, quiesce, restart, scrub, scrubprep, updatequiesced, updaterunning  7
vm_hyperv               build, quiesce, restart, scrub, scrubprep, updatequiesced, updaterunning  7
vm_kvm                  build, quiesce, restart, scrub, scrubprep, updatequiesced, updaterunning  7
vm_phantom              build, quiesce, restart, scrub, scrubprep, updatequiesced, updaterunning  7
Total                                                                                             51
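The list of supported Drivers and verbs can also be written as a lookup table, which makes it easy to check whether a Driver supports a given verb. This is a minimal sketch: `SUPPORTED_VERBS`, `is_supported` and `TOTAL_VERBS` are hypothetical names used for illustration, and the data is copied from the list above.

```python
from typing import Dict, Tuple

# The seven verbs shared by the virtual_router* and vm_* Drivers.
_FULL = ('build', 'quiesce', 'restart', 'scrub', 'scrubprep',
         'updatequiesced', 'updaterunning')

SUPPORTED_VERBS: Dict[str, Tuple[str, ...]] = {
    'backup_hyperv': ('build', 'scrub'),
    'backup_kvm': ('build', 'scrub'),
    'ceph': ('build', 'scrub', 'updatequiesced', 'updaterunning'),
    'gpu': ('updatequiesced', 'updaterunning'),
    'snapshot_hyperv': ('build', 'scrub', 'updaterunning'),
    'snapshot_kvm': ('build', 'scrub', 'updaterunning'),
    'virtual_router': _FULL,
    'virtual_router_phantom': _FULL,
    'vm_hyperv': _FULL,
    'vm_kvm': _FULL,
    'vm_phantom': _FULL,
}


def is_supported(driver: str, verb: str) -> bool:
    """Return True if the Driver is listed with the given verb."""
    return verb in SUPPORTED_VERBS.get(driver, ())


# Total number of Driver/verb combinations across all Drivers.
TOTAL_VERBS = sum(len(verbs) for verbs in SUPPORTED_VERBS.values())
```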

Driver Verb File Structure

Within a Driver, each verb listed for that Driver in the table above is implemented as a Python file. For example, the “backup_hyperv” Driver has build.py and scrub.py.
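As a quick illustration of the one-file-per-verb rule, the sketch below checks that a Driver directory contains a file for each expected verb. It assumes each Driver lives in its own directory named after the Driver, which this document does not state; `missing_verb_files` is a hypothetical helper, and the example runs against a temporary directory rather than a real Robot checkout.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def missing_verb_files(driver_dir: Path, verbs: Iterable[str]) -> List[str]:
    """Return the verb files (e.g. 'build.py') that are absent from driver_dir."""
    return [f'{verb}.py' for verb in verbs
            if not (driver_dir / f'{verb}.py').is_file()]


# Example: a backup_hyperv directory that only has build.py so far.
with tempfile.TemporaryDirectory() as tmp:
    driver_dir = Path(tmp) / 'backup_hyperv'
    driver_dir.mkdir()
    (driver_dir / 'build.py').touch()
    missing = missing_verb_files(driver_dir, ['build', 'scrub'])

print(missing)
```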

verb.py

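import logging
from typing import Any, Dict

from celery import shared_task

# Span, TRACER and flush_logstash are provided by the Robot project (imports not shown in this template)

DRIVER = 'driver_verb'
RESOURCE = 'resource'  # placeholder: name of the Resource used in log messages
LOGGER = __name__  # logger name as a string, reused for child logger names and the span name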


# Dispatcher: adds the task to the default Celery queue in RabbitMQ
def dispatch(resource_id: int):
   # Dispatches a Celery task to verb the specified Resource

   # Log a message about the dispatch, and pass the request to Celery
   logging.getLogger(f'{LOGGER}.dispatch').debug(f'Passing {RESOURCE} #{resource_id} to the verb task queue.')
   verb.delay(resource_id)


# Task: run by a worker that picks the task up from the default Celery queue in RabbitMQ
@shared_task(name="driver_verb")
def verb(resource_id: int):
   # Helper function that wraps the actual task in a span, meaning we don't have to remember to call .finish
   logger = logging.getLogger(f'{LOGGER}')
   logger.info(f'Worker has picked up the task for verb of {RESOURCE} #{resource_id}')

   span = TRACER.start_span(LOGGER)
   span.set_tag('resource', resource_id)
   _verb(resource_id, span)
   span.finish()
   # Flush the loggers here so it's not in the span
   flush_logstash()


def _verb(resource_id: int, span: Span):
   # Task to verb the specified Resource
   logger = logging.getLogger(f'{LOGGER}.task')
   logger.info(f'Commencing verb of {RESOURCE} #{resource_id}')

   # Read the resource from IaaS

   # Ensure response status_code is 200

   # Add an 'errors' key to the response data of the resource
   resource_data['errors'] = []

   # Ensure the prerequisites are met; stop here if they are not
   proceed, msg = _prerequisites(resource_data, span)
   if not proceed:
       logger.error(msg)
       return

   # Update the resource state to In Progress state for the verb and pass the data to the executor

   success: bool = False

   try:
       # NOTE: child_span must be created before this call; its creation is not shown in this template
       success = _execute(resource_data, child_span)
   except Exception as err:
       error = f'{DRIVER}_1000: An unexpected error occurred when attempting to verb {RESOURCE} #{resource_id}'
       logger.error(error, exc_info=True)
       resource_data['errors'].append(f'{error} Error: {err}')

   if success:
       logger.info(f'Successfully completed verb of {RESOURCE} #{resource_id}')

       # Update the resource state to Stable state for the verb

       # Ensure response status_code is 200

       # If required, send emails e.g. VM, VPN
   else:
       # Update state to UNRESOURCED in the API
       pass


def _prerequisites(resource_data: Dict[str, Any], span: Span):
   proceed, msg = True, ''

   # Ensure that the resource is still in the expected state

   # Ensure that all dependencies, if any, are met

   return proceed, msg


def _execute(resource_data: Dict[str, Any], span: Span) -> bool:

   # Execute required primitives to verb driver

   return True
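The control flow of the template can be exercised without Celery, tracing or the API by passing the prerequisite check and the executor in as plain functions. This is a simplified sketch under stated assumptions, not the Robot implementation: `run_verb` is a hypothetical function, the resource is an in-memory dict, the state names 'in_progress', 'stable' and 'unresourced' are illustrative stand-ins for the real state values, and leaving the state unchanged when the prerequisites fail is an assumption.

```python
from typing import Any, Callable, Dict, Tuple

Resource = Dict[str, Any]


def run_verb(
    resource: Resource,
    prerequisites: Callable[[Resource], Tuple[bool, str]],
    execute: Callable[[Resource], bool],
) -> Resource:
    """Mirror the template: check prerequisites, execute, then record the outcome."""
    resource['errors'] = []

    proceed, msg = prerequisites(resource)
    if not proceed:
        # Assumption: a failed prerequisite leaves the state untouched.
        resource['errors'].append(msg)
        return resource

    resource['state'] = 'in_progress'
    success = False
    try:
        success = execute(resource)
    except Exception as err:
        resource['errors'].append(f'driver_verb_1000: unexpected error: {err}')

    resource['state'] = 'stable' if success else 'unresourced'
    return resource


def _failing_execute(resource: Resource) -> bool:
    raise RuntimeError('primitive crashed')


ok = run_verb({'id': 1, 'state': 'requested'}, lambda r: (True, ''), lambda r: True)
failed = run_verb({'id': 2, 'state': 'requested'}, lambda r: (True, ''), _failing_execute)
blocked = run_verb({'id': 3, 'state': 'requested'}, lambda r: (False, 'wrong state'), lambda r: True)
print(ok['state'], failed['state'], blocked['state'])
```

Passing the two steps in as callables keeps the flow testable in isolation; in the real template they are the module-level `_prerequisites` and `_execute` functions.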

Driver Error codes

  • The error numbering system for Drivers is <driver>_<verb>_xxxx, where the first digit of xxxx identifies the category:

Code    Category
0xxx    API Interaction Errors
1xxx    Internal Interaction Errors
2xxx    Primitive Interaction Errors

  • Examples:

    • backup_hyperv_build_0000: API endpoint not available

    • backup_hyperv_build_0001: Resource not in expected requested state

    • backup_hyperv_build_1001: Failed to build Backup for resource #10.

    • backup_hyperv_build_2000: Primitive does not exist

    • backup_hyperv_build_2001: Primitive returned error message while reporting success
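The numbering scheme can be sketched as two small helpers: one that builds a code from its parts and one that maps a code back to its category. `format_code`, `category` and the short category labels are hypothetical names chosen for illustration; only the `<driver>_<verb>_xxxx` pattern and the three ranges come from this document.

```python
import re

# First digit of the four-digit number -> category (short labels for illustration).
CATEGORIES = {'0': 'API', '1': 'Internal', '2': 'Primitive'}

_CODE_RE = re.compile(r'^(?P<prefix>[a-z_]+)_(?P<number>\d{4})$')


def format_code(driver: str, verb: str, number: int) -> str:
    """Build a code such as 'backup_hyperv_build_0001'."""
    if not 0 <= number <= 2999:
        raise ValueError(f'Error number out of range: {number}')
    return f'{driver}_{verb}_{number:04d}'


def category(code: str) -> str:
    """Return the category of a '<driver>_<verb>_xxxx' code."""
    match = _CODE_RE.match(code)
    if match is None:
        raise ValueError(f'Not a Driver error code: {code!r}')
    label = CATEGORIES.get(match['number'][0])
    if label is None:
        raise ValueError(f'Unknown category in code: {code!r}')
    return label


print(format_code('backup_hyperv', 'build', 1))
print(category('backup_hyperv_build_2001'))
```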

Driver Resource Locking Mechanism

  • Drivers use CloudCIX’s Jerry Locks mechanism, which is available through CloudCIX’s Python SDK hosted on PyPI.

  • Jerry Locks implements Peterson’s mutual exclusion (mutex) algorithm.

  • When a Driver locks a Resource’s physical infrastructure, no other Driver can use the same physical infrastructure.

  • Drivers lock physical infrastructure according to the following flow diagram:

../_images/LockTransactions.png
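For readers unfamiliar with Peterson’s algorithm, the sketch below shows its classic two-party form using two Python threads standing in for two Drivers that contend for the same infrastructure. It is not the Jerry Locks implementation and does not use the CloudCIX SDK: `PetersonLock` and `run_demo` are hypothetical names, and the sketch relies on CPython executing the flag and turn assignments in program order.

```python
import threading
import time


class PetersonLock:
    """Two-party mutual exclusion using Peterson's algorithm."""

    def __init__(self) -> None:
        self.flag = [False, False]  # flag[i] is True while party i wants the lock
        self.turn = 0               # the party that goes first when both want it

    def acquire(self, me: int) -> None:
        other = 1 - me
        self.flag[me] = True
        self.turn = other           # politely give the other party priority
        while self.flag[other] and self.turn == other:
            time.sleep(0)           # yield while the other party has priority

    def release(self, me: int) -> None:
        self.flag[me] = False


def run_demo(iterations: int = 200) -> int:
    """Two 'Drivers' increment a shared counter inside the critical section."""
    lock = PetersonLock()
    state = {'counter': 0}

    def driver(me: int) -> None:
        for _ in range(iterations):
            lock.acquire(me)
            current = state['counter']
            time.sleep(0)           # widen the window in which a race could occur
            state['counter'] = current + 1
            lock.release(me)

    threads = [threading.Thread(target=driver, args=(i,)) for i in (0, 1)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return state['counter']


print(run_demo())
```

Because only one thread can be between `acquire` and `release` at a time, no increment is lost even though the read and the write of the counter are deliberately separated.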