
I'm running serial commands to test connected devices under test (DUTs), & we need to run these tests in parallel/concurrently to speed up our test runs. I'd like some feedback on which concurrency model would be most appropriate for this use case.

We're currently using Python to run our tests, so I've been reading up on async and multithreading. These are our current requirements.

  1. Individual tests are run on a DUT. The steps within the test must be run sequentially, but multiple DUTs may be tested in parallel - this suggests that tests are the unit of work to be run in parallel
  2. Tests may use other resources (connections to servers/databases, other supporting hardware) that may be blocking/not thread safe - need to manage access to blocking resources
  3. The main source of slowdown is the issuing of commands to DUTs. Often we send a command & wait for a response, or poll the DUT until it reaches a certain state. During this time, we would be able to send commands to other DUTs/perform other computations - need to leverage concurrency between waiting for response from DUTs

Based on these requirements, I'm leaning towards a threading model. Based on my research & experience, using async would require developing an async API for issuing commands to DUTs. Additionally, there's little need for synchronization between threads: aside from shared resources, the only thing needed from parallel tests is the end result, & no cross-thread communication is required.


The solution I'm thinking of looks like:

  1. A test plan can be broken down into sequential steps
  2. A test step can be broken down into parallel tests, which can be run on individual threads
  3. Shared resources can be made thread safe using locks & mutexes
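A minimal sketch of that model, with a thread per DUT, sequential steps inside each test, and a lock guarding a shared piece of external test hardware (the DUT names and step contents are placeholders):

```python
import threading

# Hypothetical shared external tester that can handle only one DUT at a time.
shared_tester_lock = threading.Lock()
results = {}

def run_test(dut_id):
    """One test: its steps run sequentially on a dedicated thread."""
    steps = []
    steps.append(f"{dut_id}: reboot")                 # step 1
    steps.append(f"{dut_id}: poll until powered on")  # step 2
    with shared_tester_lock:                          # step 3 needs exclusive hardware
        steps.append(f"{dut_id}: external hardware test")
    results[dut_id] = steps  # each thread writes its own key

# One thread per DUT; the tests run concurrently.
threads = [threading.Thread(target=run_test, args=(f"DUT{i}",)) for i in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(results))
```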

I'd welcome feedback on my reasoning if I've missed out any other drawbacks or considerations, & I'd also welcome someone making the case for an async model too!

  • The question needs a lot of clarification before even a high-level answer can be drafted: (a) where is the time-stamped profiling data from running the test plan as-is? How did you conclude what is and is not "blocking", what can and cannot run "concurrently" (with what caps on shared resources causing the claimed blocking, requiring some form of coordination for CP/M-validated shortest possible test-plan run times), and how much time (the distribution of actual times) is lost, and potentially maskable, in serial-request/serial-response latency? Commented Sep 4 at 2:27
  • (b) How many serial COMx ports can your test-plan hosting device operate? (c) How many serial COMx ports can each DUT operate "concurrently", or what kind of terminal-like MUX can you use error-free to drive concurrent workloads on the DUT, logged on the hosting device for later decode/pretty-print from the raw, concurrently acquired stream of chars? (d) What are the target numbers: N-many DUTs tested in "parallel", M-many DUT-owned COMx ports per device tested in a batch? Numbers matter, details too ... Commented Sep 4 at 2:33
  • a) Each test station can only test 4-8 DUTs at a time sequentially. Increasing throughput requires more test stations which limits scalability. Which tests can be run in parallel depends on the feature being tested & the design of the test. b, c and d) the goal is to test 6 units with increased concurrency, & each DUT uses 1 COMx port Commented Sep 4 at 2:43
  • Apologies if the question is light on details, but a lot of the answer is "it depends". As an example, a DUT may need to be rebooted & then checked that it is powered on, & that may take up to 5 seconds, in which time operations on a different DUT may be performed. A DUT may also need to be tested by external hardware that cannot test more than 1 at a time, as an example of a "blocking" resource. At this stage we are concerned with designing a model for running tests concurrently. Commented Sep 4 at 2:49
  • Having seen TELCO carrier NMS/telemetry/housekeeping-alarms/security overlay infrastructures, having seen COMPAQ Computers assembly-line manufacturing QA/QC/TQM diagnostic stations running at designed assembly-line cadence 24/7/365, having been EMEA PM for one of the Top-10 global migrations of above 25k PC stations + POS HW worldwide (with similarly sketched test plans for semi-automated diagnostics & upgrades), and having been impressed by the JPL Deep Space Network (read: 300 bps in the early '80s) re-programming Voyager 2's onboard O/S so as to collect still images of Neptune in darkness: start the analysis with CP/M Commented Sep 4 at 3:41

1 Answer


Just go for threads.

To use asyncio here, one would need a good understanding of how asyncio works and would have to juggle things around; a single badly written async I/O operation (say, in a 3rd-party driver), or one that takes too long to yield back to the event loop, would stall the whole operation.

If possible, you could/should use a Python 3.14 release candidate with a free-threading build: that ensures multiple cores can be used in parallel, so even if some code is CPU-bound, it won't stall the other threads.

Use a 'worker' model and queues to communicate tasks and results across threads. I'd suggest concurrent.futures.ThreadPoolExecutor, which already implements this model under the hood and offers a high-level interface. Since you won't want more than one device to be tested on the same thread, here comes an interesting approach:
Create ONE SEPARATE executor instance for each device to test (with 2-3 threads each), plus a separate executor to handle any other I/O you have.

For emphasis, I am rewording the same information and advice:

  • ThreadPoolExecutor "does the right thing": it reuses OS-level threads via internal queues and offers a nice Future/Promise interface; it takes care of the boilerplate and edge cases.
  • The multi-threaded approach automatically gives you concurrency whenever one of the threads blocks waiting for I/O.
  • With a free-threading Python build (available from Python 3.13, and officially recognized as "production ready" in Python 3.14t), it also gives CPU-intensive pure-Python code multicore parallelism (traditionally Python runs Python code in only one thread at a time, even when several are scheduled and CPU cores sit idle).
  • Using ThreadPoolExecutor as in the default examples and docs may (actually will) end up with tasks that should execute concurrently being scheduled onto the same worker, losing the parallelism you want. Having a separate executor per device, and manually calling .submit on each one so that each device runs in its dedicated executor, avoids that.
  • Asyncio can offer some simpler patterns, but it won't leverage parallel multicore execution even with a free-threaded Python, and any CPU-intensive Python code would stop all other code from running in parallel.
  • Asyncio can still be nice for orchestrating everything described above via loop.run_in_executor calls.
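A sketch of that hybrid approach (the device names and blocking_test stub are placeholders): asyncio orchestrates, while the actual blocking serial work runs in per-device executors via loop.run_in_executor.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor

def blocking_test(name):
    # Stand-in for a blocking serial-command sequence.
    return f"{name}: done"

async def main():
    loop = asyncio.get_running_loop()
    # One single-worker executor per device, driven from the event loop.
    executors = {n: ThreadPoolExecutor(max_workers=1) for n in ("DUT1", "DUT2")}
    tasks = [loop.run_in_executor(ex, blocking_test, n) for n, ex in executors.items()]
    results = await asyncio.gather(*tasks)  # wait for all devices concurrently
    for ex in executors.values():
        ex.shutdown()
    return results

results = asyncio.run(main())
print(results)
```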

Continuing - I just noticed the "resources that may not be thread safe" part of the question. If you create an exclusive one-worker executor to manage such a resource (say a DB connector) and only touch that resource from functions executed in that executor, it should work. For example, an sqlite3 connection should always be used in the thread that created it; the initializer parameter when instantiating the executor can handle that setup.

Some libs/resources may not work in a thread other than the main thread: for those, you may want to use a ProcessPoolExecutor instead; the same advantages and remarks apply. (A ProcessPoolExecutor can also offer you multi-core parallelism even with a 'regular', up-to-3.13 Python, in contrast to a free-threaded build.) The main drawback is that all data passed back and forth to code run in such executors must be de/serialized across every call, though Python manages that transparently.
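A minimal sketch of that confinement pattern with sqlite3 (the table and helper names are made up): a one-worker executor whose initializer opens the connection, so every query runs on the single thread that owns it.

```python
import sqlite3
import threading
from concurrent.futures import ThreadPoolExecutor

# Thread-local storage holds the connection owned by the worker thread.
_local = threading.local()

def _init_db():
    # Runs once, inside the executor's only worker thread.
    _local.conn = sqlite3.connect(":memory:")
    _local.conn.execute("CREATE TABLE results (dut TEXT, status TEXT)")

def record(dut, status):
    _local.conn.execute("INSERT INTO results VALUES (?, ?)", (dut, status))
    _local.conn.commit()

def fetch_all():
    return _local.conn.execute("SELECT dut, status FROM results").fetchall()

# max_workers=1 confines all DB access to a single thread.
db_executor = ThreadPoolExecutor(max_workers=1, initializer=_init_db)
db_executor.submit(record, "DUT1", "PASS").result()
db_executor.submit(record, "DUT2", "FAIL").result()
rows = db_executor.submit(fetch_all).result()
db_executor.shutdown()
print(rows)
```

Test threads submit `record` calls and never touch the connection directly, so the not-thread-safe resource is only ever used from its own thread.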
