A blog for coding and discussion of coding: JavaScript, PHP, CGI, general web building, etc.

Thursday, August 25, 2016

How to improve python import speed?



This question has been asked many times on SO (for instance here), but there is no real answer yet.

I am writing a short command line tool that renders templates. It is driven by a Makefile:

i = $(wildcard *.in)
o = $(patsubst %.in, %.out, $(i))

all: $(o)

%.out: %.in
	./script.py -o $@ $<

In this dummy example, the Makefile processes every .in file to generate an .out file. It is very convenient for me to use make because I have a lot of other actions to trigger before and after this script. Moreover I would like to remain as KISS as possible.

Thus, I want to keep my tool simple and stupid, processing each file separately using the syntax script -o out in

My script uses the following:

#!/usr/bin/env python
from jinja2 import Template, nodes
from jinja2.ext import Extension
import hiyapyco
import argparse
import re

...
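For context, a minimal sketch of how the `script -o out in` interface might be wired up with argparse; the actual jinja2/hiyapyco rendering is elided, and the argument names are illustrative:

```python
import argparse

# Minimal sketch of the "script.py -o out in" command line described above.
parser = argparse.ArgumentParser(description='Render a template file.')
parser.add_argument('infile', help='input .in template')
parser.add_argument('-o', '--out', required=True, help='output .out file')

# Example invocation: script.py -o foo.out foo.in
args = parser.parse_args(['-o', 'foo.out', 'foo.in'])
print('%s -> %s' % (args.infile, args.out))  # prints: foo.in -> foo.out
```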

The problem is that each execution costs me about 1.2s ( ~60ms for the processing and ~1140ms for the import directives):

$ time ./script.py -o foo.out foo.in

real    0m1.625s
user    0m0.452s
sys     0m1.185s

The overall execution of my Makefile for 100 files is ridiculous: ~100 files x 1.2s = 120s.
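To confirm that the imports dominate, each one can be timed individually; a minimal sketch using only the standard library, with argparse standing in here for the heavier jinja2/hiyapyco imports:

```python
import time

# Time a single import statement; heavy third-party modules such as
# jinja2 or hiyapyco would show much larger numbers than stdlib modules.
t0 = time.perf_counter()
import argparse  # stand-in for a heavy import
elapsed = time.perf_counter() - t0
print('import argparse: %.4fs' % elapsed)
```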

As it stands, this is not a workable approach, although it ought to be.

What alternative can I use?

EDIT

I love Python for its readable syntax and the size of its community. In this particular case (command line tools), I have to admit Perl is still a decent alternative. The same script written in Perl (which is also an interpreted language) is about 12 times faster (using Text::Xslate).

I don't want to promote Perl in any way; I am just trying to solve my biggest issue with Python: it is not yet a suitable language for simple command line tools because of its poor import time.

Answer by silgon for How to improve python import speed?


You could use glob to perform those actions on the files you need.

import glob

in_files = glob.glob('*.in')
out_files = glob.glob('*.out')

Thus, you process all the files in the same script, instead of calling the script separately for every pair of files. At least that way you don't have to start Python every time.
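Concretely, the single-process loop might look like this, where render is a hypothetical stand-in for the template processing:

```python
import glob
import os

def render(in_path, out_path):
    # Hypothetical stand-in for the jinja2/hiyapyco rendering work.
    pass

# One interpreter start-up pays the import cost for the whole batch:
# derive each .out name from its .in counterpart and process it.
for in_path in glob.glob('*.in'):
    out_path = os.path.splitext(in_path)[0] + '.out'
    render(in_path, out_path)
```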

Answer by BPL for How to improve python import speed?


It seems quite clear where the problem is. Right now you have:

cost(file) = 1.2s = 1140ms + 60ms, which means:

cost(N files) = N * 1.2s

Now, why don't you change it to become:

cost(N files) = 1140ms + N * 60ms

That way, theoretically, processing 100 files would take 7.14s instead of 120s.

EDIT:

Because I'm receiving downvotes on this answer, I'll post a little example. Let's assume you have this Python file:

# foo.py
import sys
import numpy
import cv2

print sys.argv[0]

The execution time is 1.3s on my box. Now, if I do:

for /l %x in (1, 1, 100) do python foo.py  

I'll get 100 * 1.3s of execution time. My proposal was to turn foo.py into this:

import sys
import numpy
import cv2

def whatever_rendering_you_want_to_do(file):
    pass

for file in sys.argv[1:]:
    whatever_rendering_you_want_to_do(file)

That way you import everything only once instead of 100 times.

Answer by glglgl for How to improve python import speed?


It is not quite easy, but you could turn your program into one that sits in the background and accepts commands telling it which file to process.

Another program could then feed the processing commands to it, making each individual start quite cheap.

Answer by Chris_Rands for How to improve python import speed?


Instead of using make you could use a workflow manager in Python like ruffus or snakemake. Then you can import your script as a module and execute it. This means you only import everything once. The syntax is quite simple in ruffus:

from ruffus import *
import script

@transform(in_files, suffix(".in"), ".out")
def first_function(input, output):
    script.main(input, output)

pipeline_run(first_function)

You can then easily add other dependent functions with @follows, the documentation linked above has full details.

If you really want to use make then you could wrap this python code in a makefile.

Answer by Vorsprung for How to improve python import speed?


Write the template part as a separate, long-running process. The first time "script.py" is run, it launches this process. Once the process exists, it can be passed the input/output filenames via a named pipe. If the process gets no input for x seconds, it exits automatically; how big x is depends on your needs.

So the parameters are passed to the long-running process by script.py writing to a named pipe. The imports occur only once (provided requests arrive fairly often) and, as BPL points out, this makes everything run faster.
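A rough sketch of such a worker on a POSIX system, assuming a hypothetical pipe path and using SIGALRM as the idle timer; the pipe path, timeout, and render call are all placeholders:

```python
import os
import signal

FIFO_PATH = '/tmp/render.fifo'  # hypothetical pipe location
IDLE_SECONDS = 30               # exit after this long with no work

def idle_exit(signum, frame):
    raise SystemExit            # idle timeout reached: shut down cleanly

def serve():
    signal.signal(signal.SIGALRM, idle_exit)
    os.mkfifo(FIFO_PATH)        # assumes the pipe does not already exist
    try:
        while True:
            signal.alarm(IDLE_SECONDS)      # (re)start the idle timer
            with open(FIFO_PATH) as pipe:   # blocks until a client writes
                for line in pipe:
                    in_path, out_path = line.split()
                    # render(in_path, out_path)  # imports already paid for
    finally:
        os.unlink(FIFO_PATH)
```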

