Problem: I have to increment two variables, x1 and x2, in separate threads. The next increment of both variables must not begin until the previous increment of both variables has completed.
Proposed solution: initialize four semaphores and invoke a separate thread for the increment of each variable. Two semaphores pass the message to the child threads to start incrementing, and two semaphores pass the message to the main thread that an increment has completed. The main thread waits for both child threads to post their semaphores, showing that the increment of both variables is done, and then passes a message to both child threads allowing further incrementing.
This one is working fine for me, but can anyone suggest a better solution, or point out a problem in my solution? Much appreciated, thanks in advance.
Solution code:
    #include <stdio.h>
    #include <pthread.h>
    #include <semaphore.h>

    /* threads */
    pthread_t pth1, pth2;

    /* values to calculate */
    int x1 = 0, x2 = 0;
    sem_t c1, c2, c3, c4;

    void *threadfunc1(void *parm)
    {
        for (;;) {
            x1++;
            sem_post(&c1);   /* tell main: x1 has been incremented */
            sem_wait(&c3);   /* wait for permission to increment again */
        }
        return NULL;
    }

    void *threadfunc2(void *parm)
    {
        for (;;) {
            x2++;
            sem_post(&c2);
            sem_wait(&c4);
        }
        return NULL;
    }

    int main()
    {
        sem_init(&c1, 0, 0);
        sem_init(&c2, 0, 0);
        sem_init(&c3, 0, 0);
        sem_init(&c4, 0, 0);

        pthread_create(&pth1, NULL, threadfunc1, "foo");
        pthread_create(&pth2, NULL, threadfunc2, "foo");

        sem_wait(&c1);
        sem_wait(&c2);
        sem_post(&c3);
        sem_post(&c4);

        int loop = 0;
        while (loop < 8) {   /* the iterated step */
            loop++;
            printf("initial : x1 = %d, x2 = %d\n", x1, x2);
            sem_wait(&c1);
            sem_wait(&c2);
            printf("final   : x1 = %d, x2 = %d\n", x1, x2);
            sem_post(&c3);
            sem_post(&c4);
        }

        sem_wait(&c1);
        sem_wait(&c2);
        printf("result : x1 = %d, x2 = %d\n", x1, x2);
        pthread_cancel(pth1);
        pthread_cancel(pth2);
        sem_destroy(&c1);
        sem_destroy(&c2);
        sem_destroy(&c3);
        sem_destroy(&c4);
        return 0;
    }
Instead of having a bunch of threads do the x1 things, pausing them, and then having a bunch of threads do the x2 things, consider a threadpool. A threadpool is a bunch of threads that sit idle until you have work for them to do, then unpause and do the work.
An advantage of this system is that it uses condition variables and mutexes rather than semaphores. On many systems, mutexes are faster than semaphores (because they are more limited).
    #include <pthread.h>
    #include <vector>

    // task is an abstract class describing "something that can be done",
    // which can be put in a work queue
    class task {
    public:
        virtual ~task() {}
        virtual void run() = 0;
    };
    // could be made more object oriented if desired... this is just an example.

    // the work queue
    struct workqueue {
        std::vector<task*> queue;    // must hold mutex to access the queue
        bool finished;               // if set to true, threadpoolrun starts exiting
        pthread_mutex_t mutex;
        pthread_cond_t haswork;      // condition signaled when there may be more work
        pthread_cond_t donewithwork; // condition signaled when the queue may be empty
    };

    void* threadpoolrun(void* queueptr)
    {
        // the argument to threadpoolrun is a workqueue*
        workqueue& wq = *static_cast<workqueue*>(queueptr);
        pthread_mutex_lock(&wq.mutex);
        // precondition: every time we start the while loop, we hold the mutex.
        while (!wq.finished) {
            // try to get work. if there is none, wait until somebody signals haswork.
            if (wq.queue.empty()) {
                // the queue is empty. before waiting until some thread signals that
                // there may be work, signal the main thread that it may be empty.
                pthread_cond_broadcast(&wq.donewithwork);
                pthread_cond_wait(&wq.haswork, &wq.mutex);
            } else {
                // there is work to be done. grab a task, release the mutex (so
                // other threads can get things from the work queue), and start working!
                task* mytask = wq.queue.back();
                wq.queue.pop_back();  // no one else should start this task
                pthread_mutex_unlock(&wq.mutex);
                // the other threads can look at the queue while we take our time
                // and complete the task.
                mytask->run();
                // re-acquire the mutex, so we hold it at the top of the while
                // loop (where we need it to check wq.finished).
                pthread_mutex_lock(&wq.mutex);
            }
        }
        pthread_mutex_unlock(&wq.mutex);
        return NULL;
    }

    // now we can define a bunch of tasks for the particular problem
    class task_x1a : public task {
    public:
        task_x1a(int* indata) : mdata(indata) { }
        virtual void run() {
            // do calculations on mdata
        }
    private:
        int* mdata;
    };
    class task_x1b : public task { ... };
    class task_x1c : public task { ... };
    class task_x1d : public task { ... };
    class task_x2a : public task { ... };
    class task_x2b : public task { ... };
    class task_x2c : public task { ... };
    class task_x2d : public task { ... };

    int main()
    {
        // i bet you thought we'd never get here!
        static const int numberofworkers = 4; // tends to be either the number
                                              // of cpus or cpus * 2
        workqueue workqueue;  // the work queue shared by all the threads
        workqueue.finished = false;
        pthread_mutex_init(&workqueue.mutex, NULL);
        pthread_cond_init(&workqueue.haswork, NULL);
        pthread_cond_init(&workqueue.donewithwork, NULL);

        pthread_t workers[numberofworkers];
        int data[10];
        for (int i = 0; i < numberofworkers; i++)
            pthread_create(&workers[i], NULL, &threadpoolrun, &workqueue);

        // all of the workers are sitting idle, ready for work.
        // give them the x1 tasks
        {
            task_x1a x1a(data);
            task_x1b x1b(data);
            task_x1c x1c(data);
            task_x1d x1d(data);
            pthread_mutex_lock(&workqueue.mutex);
            workqueue.queue.push_back(&x1a);
            workqueue.queue.push_back(&x1b);
            workqueue.queue.push_back(&x1c);
            workqueue.queue.push_back(&x1d);
            // now that we've queued a bunch of work, we have to signal the
            // workers that work is available
            pthread_cond_broadcast(&workqueue.haswork);
            // ... and wait until the workers finish
            while (!workqueue.queue.empty())
                pthread_cond_wait(&workqueue.donewithwork, &workqueue.mutex);
            pthread_mutex_unlock(&workqueue.mutex);
        }
        // then the x2 tasks
        {
            task_x2a x2a(data);
            task_x2b x2b(data);
            task_x2c x2c(data);
            task_x2d x2d(data);
            pthread_mutex_lock(&workqueue.mutex);
            workqueue.queue.push_back(&x2a);
            workqueue.queue.push_back(&x2b);
            workqueue.queue.push_back(&x2c);
            workqueue.queue.push_back(&x2d);
            // now that we've queued a bunch of work, we have to signal the
            // workers that work is available
            pthread_cond_broadcast(&workqueue.haswork);
            // ... and wait until the workers finish
            while (!workqueue.queue.empty())
                pthread_cond_wait(&workqueue.donewithwork, &workqueue.mutex);
            pthread_mutex_unlock(&workqueue.mutex);
        }

        // at the end of all of the work, we want to signal the workers that
        // they should stop, by setting workqueue.finished to true and
        // signalling them
        pthread_mutex_lock(&workqueue.mutex);
        workqueue.finished = true;
        pthread_cond_broadcast(&workqueue.haswork);
        pthread_mutex_unlock(&workqueue.mutex);

        for (int i = 0; i < numberofworkers; i++)
            pthread_join(workers[i], NULL);

        pthread_mutex_destroy(&workqueue.mutex);
        pthread_cond_destroy(&workqueue.haswork);
        pthread_cond_destroy(&workqueue.donewithwork);
        return data[0];
    }
Major notes:
- If you have more tasks than CPUs, making more threads is just more bookkeeping for the CPU. A threadpool accepts any number of tasks and works on them with as efficient a number of CPUs as possible.
- If there is way more work than CPUs (like 4 CPUs and 1000 tasks), this system is very efficient. A mutex lock/unlock is about the cheapest thread synchronization there is, short of a lock-free queue (which is way more work than it's worth). If you have a bunch of tasks, the workers just grab them one at a time.
- If the tasks are terribly tiny (like the increment example above), you can modify the threadpool to grab multiple tasks at once and work on them serially.