  • Motivation:
    • With GIP 4 (MySQL-backed job config store), a Gobblin YARN instance picks up every new job from the metadata. There is no way to keep common metadata from which multiple different instances can run.
    • Such sharing is required in an environment where there are multiple Gobblin installations.
  • Proposed Change:
    • Introduce a namespace.
    • When Gobblin starts, it can specify which namespace it wants to operate on. If none is specified, it defaults to the "default" namespace.
  • New or Changed Public Interfaces:
    • No change to any interface or CLI.
  • Migration Plan and Compatibility:
    • This feature must be fully backward compatible.
  • Rejected Alternatives:
    • None
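
The namespace defaulting described under Proposed Change could be sketched as follows. This is a minimal illustration, not Gobblin's implementation: the class name NamespaceConfig, the property key gobblin.namespace, and the choice to treat a blank value as unset are all assumptions made for the example. Only the "default" fallback comes from the proposal.

```java
import java.util.Properties;

/** Sketch only: the class name and property key are hypothetical, not part of Gobblin. */
final class NamespaceConfig {
    // Hypothetical property key; the proposal does not name one.
    static final String NAMESPACE_KEY = "gobblin.namespace";
    // Fallback namespace named in the proposal.
    static final String DEFAULT_NAMESPACE = "default";

    private NamespaceConfig() {}

    /** Returns the configured namespace, or "default" when it is missing or blank. */
    static String resolveNamespace(Properties props) {
        String value = props.getProperty(NAMESPACE_KEY);
        if (value == null || value.trim().isEmpty()) {
            // Assumption: a blank value is treated the same as an absent one.
            return DEFAULT_NAMESPACE;
        }
        return value.trim();
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        System.out.println(resolveNamespace(props)); // no namespace configured

        props.setProperty(NAMESPACE_KEY, "team-a");
        System.out.println(resolveNamespace(props)); // explicit namespace
    }
}
```

Because an installation that sets nothing falls back to "default", existing deployments would keep working without any configuration change, which is consistent with the backward-compatibility requirement above.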



Here is a high-level diagram showing how it would work:


Gobblin Table changes
CREATE TABLE `gobblin_deployment` (
  `deployment_id` varchar(255) NOT NULL DEFAULT '' COMMENT 'name-env-version',
  `name` varchar(255) NOT NULL DEFAULT '',
  `env` varchar(255) NOT NULL DEFAULT '',
  `version` float DEFAULT NULL,
  `status` varchar(255) NOT NULL DEFAULT '' COMMENT '"STARTED", "RUNNING", "STOPPED"',
  `configs` varchar(1024) NOT NULL DEFAULT '',
  `start_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP,
  `update_time` timestamp NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`deployment_id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;


CREATE TABLE `gobblin_job_queue` (
  `queue_id` varchar(255) NOT NULL,
  `job_name` varchar(255) NOT NULL,
  `configs` text,
  `status` varchar(255) NOT NULL,
  `job_id` varchar(255) DEFAULT NULL,
  `created_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP,
  `updated_date` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  `for_deployment_id` varchar(255) DEFAULT NULL,
  `failure_exception` varchar(255) DEFAULT NULL,
  PRIMARY KEY (`queue_id`),
  KEY `job_name` (`job_name`),
  CONSTRAINT `gobblin_job_queue_ibfk_1` FOREIGN KEY (`job_name`) REFERENCES `gobblin_job` (`name`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
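
One way a deployment might use the `for_deployment_id` column is to select only the queue entries addressed to it. The sketch below works on in-memory stand-ins for `gobblin_job_queue` rows rather than on a database. The class names, the sample ids, and the decision to skip rows whose `for_deployment_id` is NULL are assumptions for illustration; the proposal does not state how such rows should be handled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Sketch only: hypothetical helper, not part of Gobblin. */
final class JobQueueFilter {

    /** Minimal stand-in for a gobblin_job_queue row; only the columns used here. */
    static final class QueueEntry {
        final String queueId;
        final String jobName;
        final String forDeploymentId; // nullable, as in the table definition

        QueueEntry(String queueId, String jobName, String forDeploymentId) {
            this.queueId = queueId;
            this.jobName = jobName;
            this.forDeploymentId = forDeploymentId;
        }
    }

    private JobQueueFilter() {}

    /**
     * Keeps the entries whose for_deployment_id equals the given deployment id.
     * Assumption: entries with a NULL for_deployment_id are skipped.
     */
    static List<QueueEntry> entriesFor(String deploymentId, List<QueueEntry> queue) {
        List<QueueEntry> matches = new ArrayList<>();
        for (QueueEntry entry : queue) {
            if (deploymentId.equals(entry.forDeploymentId)) {
                matches.add(entry);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Sample ids follow the 'name-env-version' shape noted in the deployment table.
        List<QueueEntry> queue = Arrays.asList(
            new QueueEntry("q1", "jobA", "etl-prod-1.0"),
            new QueueEntry("q2", "jobB", "etl-dev-1.0"),
            new QueueEntry("q3", "jobC", null));
        for (QueueEntry entry : entriesFor("etl-prod-1.0", queue)) {
            System.out.println(entry.queueId + " " + entry.jobName);
        }
    }
}
```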


