







PHP Based Job Scheduler

Posted by rhossis, 09 July 2014 · 7273 views

php job scheduler
The requirement could not have been any clearer: "I want a native PHP application that does something similar to Obsidian Job Scheduler" (Guess you're not signing my leave form, boss?). This was a new and direct requirement we got earlier this year. Until then, scheduled jobs in PHP were run via cron jobs that make HTTP calls. No problem there, unless you want a little more fault tolerance and efficiency from the job scheduler, IMO. Even Drupal 7 runs its cron through a cron job that calls wget.

A job scheduler should support the following. The first six points are taken from the Wikipedia entry for Job Scheduler.
  • Interfaces which help to define workflows and/or job dependencies
  • Automatic submission of executions
  • Interfaces to monitor the executions
  • Priorities and/or queues to control the execution order of unrelated jobs
  • Real-time scheduling based on external, unpredictable events
  • Automatic restart and recovery in event of failure
  • Alerting and notification to operations personnel
  • Generation of incident reports
  • Audit trails for regulatory compliance purposes
I recently read an article on a developers' forum about taking on challenges you shouldn't. Basically, the moral was to ask yourself "Can I?" and then "Should I?" For me, this requirement raised a lot of doubts on the "Can I?" side. At the same time, after installing and playing around with Obsidian and comparing it to our cron-based scheduler, I could see why a proper scheduler may be useful. In some situations, knowing the state of the job, and knowing what to do based on that state, is as important as running the job at a particular time. It is a simple concept, and it could still be achieved with the cron-based scheduler. However, just as every application could in principle be built from sequence, decision and loop statements, I have come to learn that in the long run you really want a tool that is fit for purpose. None of this "these are common concepts, any implementation could do it" stuff. Engine parts are common to all cars, but are they chosen for what you want the car to do?
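To make the "state of the job" idea concrete, here is a minimal sketch of a job that records its own state transitions and retries on failure. It is written in Python only for brevity; the same structure ports directly to PHP. The state names, the `Job` class and the `max_retries` parameter are assumptions made for illustration, not taken from Obsidian or from our implementation.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    RETRYING = "retrying"


# Hypothetical transition table: which states may follow which.
ALLOWED = {
    JobState.PENDING: {JobState.RUNNING},
    JobState.RUNNING: {JobState.SUCCEEDED, JobState.FAILED},
    JobState.FAILED: {JobState.RETRYING},
    JobState.RETRYING: {JobState.RUNNING},
    JobState.SUCCEEDED: set(),
}


class Job:
    def __init__(self, name, max_retries=3):
        self.name = name
        self.max_retries = max_retries  # retries allowed after the first attempt
        self.attempts = 0
        self.state = JobState.PENDING
        self.history = [JobState.PENDING]  # audit trail of every state visited

    def move_to(self, new_state):
        """Change state, refusing transitions the table does not allow."""
        if new_state not in ALLOWED[self.state]:
            raise ValueError(f"{self.state.name} -> {new_state.name} not allowed")
        self.state = new_state
        self.history.append(new_state)

    def run(self, task):
        """Call task() until it returns True or the retries run out."""
        while True:
            self.move_to(JobState.RUNNING)
            self.attempts += 1
            try:
                ok = bool(task())
            except Exception:
                ok = False  # an exception counts as a failed attempt
            if ok:
                self.move_to(JobState.SUCCEEDED)
                return True
            self.move_to(JobState.FAILED)
            if self.attempts > self.max_retries:
                return False
            self.move_to(JobState.RETRYING)
```

A plain cron entry only knows "it is time, run it". With something like the `history` list above, the scheduler can also answer "what happened last time?" and decide what to do next.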

One scenario for our job scheduling is updating the end-of-day stock exchange equity prices for our client. The industry standard where they are based is to use a price called the closing value weighted average price of the day (I will not get into what that means on codecall :) ), which the exchange only publishes as a spreadsheet sent to mailing lists. If the client were using the closing price, we could just pull that off the online data feed. Instead, the steps in this process are: download the file from email automatically, extract the data from Excel, and upload the data to the database. Three easy steps, but the file is 1MB, which, given latency at the wrong time, may only partially download. Furthermore, the file comes in a binary format (Excel), which makes it harder to validate on the fly as you download it.

One of the solutions was to add a download service function and an extract function to the cron job. When the cron job establishes that the file is fully downloaded, it does the extract; otherwise it keeps trying to download. It has to check regularly because, although the stock exchange closes at 3 PM, the price list is made available any time between 3:30 PM and 6 PM, based on what we have noticed.
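The download-then-extract logic described above can be sketched roughly as follows (Python for brevity; the idea ports directly to PHP). All function names here are hypothetical. The `download`, `is_complete`, `extract` and `upload` callables are placeholders for the real email, Excel and database code, and `deadline` is assumed to be a Unix timestamp for the end of the window, for example 6 PM.

```python
import time


def fetch_until_complete(download, is_complete, deadline,
                         poll_seconds=300, now=time.time, sleep=time.sleep):
    """Retry download() until is_complete(data) holds or the deadline passes.

    Returns the complete payload, or None if the window closed first.
    """
    while now() <= deadline:
        data = download()
        if data is not None and is_complete(data):
            return data
        sleep(poll_seconds)  # wait before polling the mailbox again
    return None


def run_price_update(download, is_complete, extract, upload, deadline, **kwargs):
    """Download, then extract, then upload; a later step runs only if the
    earlier one succeeded."""
    data = fetch_until_complete(download, is_complete, deadline, **kwargs)
    if data is None:
        return "download_failed"  # the scheduler can alert operations here
    rows = extract(data)
    upload(rows)
    return "done"
```

Passing `now` and `sleep` in as parameters is a deliberate choice: it lets the retry loop be tested with a fake clock instead of actually waiting for hours. How `is_complete` decides that a binary Excel file arrived whole (comparing against the attachment size reported by the mail server, for instance) is left open.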

I think I can now see why such a requirement was made. Cron was working fine, but over time, with such dicey jobs, something could go awry and cost the client money. I am glad that the client could foresee this. 90% of the compliance applications on the site run on PHP, so on to the R&D:

From my research, the following were ready-to-use tools which we could apply to a PHP-based application:
1) Gearman (Job Queue application)
2) JobScheduler

Gearman would be a 100% perfect fit to start from, like the girl next door about whom everything you learn seems to confirm your feeling that she is right for you, until the oops moment. Getting it to work in a Windows environment requires Cygwin, and the infrastructure guys indicated that governance policy restricted them from making a case for that setup. Rather than make the platform the problem, a native PHP scheduler would eliminate that issue. JobScheduler will execute your scripts, but it really cannot monitor the "job states" on the business end of the PHP application.

That was the challenge. In the next couple of days, I will highlight the implementation process. Trying out Obsidian was really helpful, and it was good to see that it supports scripting, including Python scripting through a Jython interface. I can't help but wonder if they could do a Quercus interface for it, which would then support PHP scripting. See you in a couple of days. The World Cup game between Argentina and the Netherlands is now getting dicey :D




I always find it interesting when I get "requirements" from a customer/boss/whomever and my first response is, "I'm not sure this makes sense," followed by, "I don't think your proposed solution is the best one." My first thought on reading the above was, "Why does it have to be in PHP?" I'd have steered towards a .NET solution, since you're in a Windows environment already and especially since you're dealing with Excel.

 

I'm assuming you have no linux boxes/VMs :)


Good question. This solution is distributed across some other clients, some of whom insist on running Linux boxes. Thus, without the ability to dictate the platform, we have to be platform independent. IMHO, I agree that this proposed solution is not the best. There are ready tools out there for each platform. I think part of the motivation is using the project as a proof of concept to see if we fall flat. Most modern job schedulers are event driven, and with PHP having frameworks such as React and tools like ZeroMQ for IPC, might it be time for PHP to step onto the big stage too?  :)


Well, a service/daemon written in C/C++ comes to mind :)


And integrating it with PHP, that would be via a compiled PHP extension?


Or, alternatively, having it call the PHP code. Of course, then it could be a scheduled task.


Yes, I agree. If it is calling PHP code, it could be a scheduled task ... I stumbled on this over the weekend. This seems like an implementation that could match the requirements: http://mtdowling.com...essions-in-php/ It uses Gearman. Along with scheduling the jobs, the additional requirements were automatic recovery (e.g. if the box was off or offline during a job), pausing a job, and pausing a job until a time specified by the admin :)
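For what it's worth, the recovery and pause-until requirements could be sketched as a small decision function that the scheduler calls on every tick (Python for brevity; it ports directly to PHP). The `ScheduledJob` fields and the four return values are assumptions for illustration only, and the rule "more than one full interval late means the box was down" is just one possible heuristic.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ScheduledJob:
    name: str
    interval: timedelta                      # how often the job should run
    last_run: Optional[datetime] = None      # None means it has never run
    paused_until: Optional[datetime] = None  # set by the admin, if at all


def decide(job, now):
    """Return 'paused', 'wait', 'run' or 'recover' for the job at time `now`."""
    if job.paused_until is not None and now < job.paused_until:
        return "paused"
    if job.last_run is None:
        return "run"
    overdue = now - job.last_run - job.interval
    if overdue < timedelta(0):
        return "wait"  # the next run is not due yet
    # A full interval or more late: assume the box was off or offline
    # and at least one run was missed.
    if overdue >= job.interval:
        return "recover"
    return "run"
```

What "recover" actually does (run once, replay every missed run, or just alert someone) would depend on the job, so it is deliberately left out here.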
