博客
关于我
强烈建议你试试无所不能的chatGPT,快点击我
Python Rate Limiter
阅读量:4030 次
发布时间:2019-05-24

本文共 10152 字,大约阅读时间需要 33 分钟。

 Python IRC bot, and it partially works, but if someone triggers less messages than the limit (e.g., rate limit is 5 messages per 8 seconds, and the person triggers only 4), and the next trigger is over the 8 seconds (e.g., 16 seconds later), the bot sends the message, but the queue becomes full and the bot waits 8 seconds, even though it's not needed since the 8 second period has lapsed.

 

8 Answers

134
accepted

Here the simplest algorithm, if you want just to drop messages when they arrive too quickly (instead of queuing them, which makes sense because the queue might get arbitrarily large):

rate = 5.0; // unit: messagesper  = 8.0; // unit: secondsallowance = rate; // unit: messageslast_check = now(); // floating-point, e.g. usec accuracy. Unit: secondswhen (message_received):  current = now();  time_passed = current - last_check;  last_check = current;  allowance += time_passed * (rate / per);  if (allowance > rate):    allowance = rate; // throttle  if (allowance < 1.0):    discard_message();  else:    forward_message();    allowance -= 1.0;

There are no datastructures, timers etc. in this solution and it works cleanly :) To see this, 'allowance' grows at speed 5/8 units per seconds at most, i.e. at most five units per eight seconds. Every message that is forwarded deducts one unit, so you can't send more than five messages per every eight seconds.

Note that rate should be an integer, i.e. without non-zero decimal part, or the algorithm won't work correctly (actual rate will not be rate/per). E.g. rate=0.5; per=1.0; does not work because allowance will never grow to 1.0. But rate=1.0; per=2.0; works fine.

 
2  
It's also worth pointing out that the dimension and scale of 'time_passed' must be the same as 'per', e.g. seconds. –   
Jun 17 '09 at 13:13
2  
Hi skaffman, thanks for the compliments---I threw it out of my sleeve but with 99.9% probability someone has earlier came up with a similar solution :) –   
Jun 19 '09 at 5:02
25  
That is a standard algorithm—it's a token bucket, without queue. The bucket is allowance. The bucket size is rate. The allowance += … line is an optimization of adding a token every rate ÷ per seconds.–   
Jan 26 '12 at 19:32 
4  
@zwirbeltier What you write above is not true. 'Allowance' is always capped by 'rate' (look at the "// throttle" line) so it will only allow a burst of exactly 'rate' messages at any particular time, i.e. 5. –   
Mar 18 '13 at 8:20
4  
This is good, but can exceed the rate. Let's say at time 0 you forward 5 messages, then at time N * (8/5) for N = 1, 2, ... you can send another message, resulting in more than 5 messages in an 8 second period –   
Aug 30 '13 at 16:17 
32

Use this decorator @RateLimited(ratepersec) before your function that enqueues.

Basically, this checks if 1/rate secs have passed since the last time and if not, waits the remainder of the time, otherwise it doesn't wait. This effectively limits you to rate/sec. The decorator can be applied to any function you want rate-limited.

In your case, if you want a maximum of 5 messages per 8 seconds, use @RateLimited(0.625) before your sendToQueue function.

import timedef RateLimited(maxPerSecond):    minInterval = 1.0 / float(maxPerSecond)    def decorate(func):        lastTimeCalled = [0.0]        def rateLimitedFunction(*args,**kargs):            elapsed = time.clock() - lastTimeCalled[0]            leftToWait = minInterval - elapsed            if leftToWait>0:                time.sleep(leftToWait)            ret = func(*args,**kargs)            lastTimeCalled[0] = time.clock()            return ret        return rateLimitedFunction    return decorate@RateLimited(2)  # 2 per second at mostdef PrintNumber(num):    print numif __name__ == "__main__":    print "This should print 1,2,3... at about 2 per second."    for i in range(1,100):        PrintNumber(i)
 
 
I like the idea of using a decorator for this purpose. Why do is lastTimeCalled a list? Also, I doubt this'll work when multiple threads are calling the same RateLimited function... –   
Mar 20 '09 at 20:09
5  
It's a list because simple types like float are constant when captured by a closure. By making it a list, the list is constant, but its contents are not. Yes, it's not thread-safe but that can be easily fixed with locks. –   
Mar 20 '09 at 21:08
 
time.clock() doesn't have enough resolution on my system, so I adapted the code and changed to use time.time() –   
Jun 5 '14 at 19:20
1  
For rate limiting, you definitely do not want to use time.clock(), which measures elapsed CPU time. CPU time can run much faster or much slower than "actual" time. You want to use time.time() instead, which measures wall time ("actual" time). –   
Dec 21 '15 at 23:42
 
BTW for real production systems: implementing a rate limiting with a sleep() call might not be a good idea as it is going to block the thread and therefore preventing another client from using it. –   
Feb 7 at 20:56
19

A Token Bucket is fairly simple to implement.

Start with a bucket with 5 tokens.

Every 5/8 seconds: If the bucket has less than 5 tokens, add one.

Each time you want to send a message: If the bucket has ≥1 token, take one token out and send the message. Otherwise, wait/drop the message/whatever.

(obviously, in actual code, you'd use an integer counter instead of real tokens and you can optimize out the every 5/8s step by storing timestamps)


Reading the question again, if the rate limit is fully reset each 8 seconds, then here is a modification:

Start with a timestamp, last_send, at a time long ago (e.g., at the epoch). Also, start with the same 5-token bucket.

Strike the every 5/8 seconds rule.

Each time you send a message: First, check if last_send ≥ 8 seconds ago. If so, fill the bucket (set it to 5 tokens). Second, if there are tokens in the bucket, send the message (otherwise, drop/wait/etc.). Third, set last_send to now.

That should work for that scenario.


I've actually written an IRC bot using a strategy like this (the first approach). Its in Perl, not Python, but here is some code to illustrate:

The first part here handles adding tokens to the bucket. You can see the optimization of adding tokens based on time (2nd to last line) and then the last line clamps bucket contents to the maximum (MESSAGE_BURST)

my $start_time = time;    ...    # Bucket handling    my $bucket = $conn->{fujiko_limit_bucket};    my $lasttx = $conn->{fujiko_limit_lasttx};    $bucket += ($start_time-$lasttx)/MESSAGE_INTERVAL;    ($bucket <= MESSAGE_BURST) or $bucket = MESSAGE_BURST;

$conn is a data structure which is passed around. This is inside a method that runs routinely (it calculates when the next time it'll have something to do, and sleeps either that long or until it gets network traffic). The next part of the method handles sending. It is rather complicated, because messages have priorities associated with them.

# Queue handling. Start with the ultimate queue.    my $queues = $conn->{fujiko_queues};    foreach my $entry (@{$queues->[PRIORITY_ULTIMATE]}) {            # Ultimate is special. We run ultimate no matter what. Even if            # it sends the bucket negative.            --$bucket;            $entry->{code}(@{$entry->{args}});    }    $queues->[PRIORITY_ULTIMATE] = [];

That's the first queue, which is run no matter what. Even if it gets our connection killed for flooding. Used for extremely important thinks, like responding to the server's PING. Next, the rest of the queues:

# Continue to the other queues, in order of priority.    QRUN: for (my $pri = PRIORITY_HIGH; $pri >= PRIORITY_JUNK; --$pri) {            my $queue = $queues->[$pri];            while (scalar(@$queue)) {                    if ($bucket < 1) {                            # continue later.                            $need_more_time = 1;                            last QRUN;                    } else {                            --$bucket;                            my $entry = shift @$queue;                            $entry->{code}(@{$entry->{args}});                    }            }    }

Finally, the bucket status is saved back to the $conn data structure (actually a bit later in the method; it first calculates how soon it'll have more work)

# Save status.    $conn->{fujiko_limit_bucket} = $bucket;    $conn->{fujiko_limit_lasttx} = $start_time;

As you can see, the actual bucket handling code is very small — about four lines. The rest of the code is priority queue handling. The bot has priority queues so that e.g., someone chatting with it can't prevent it from doing its important kick/ban duties.

 
 
Am I missing something... it looks like this would limit you to 1 message every 8 seconds after you get through the first 5 –   
Mar 20 '09 at 19:12
 
@chills42: Yes, I read the question wrong... see the second half of the answer. –   
Mar 20 '09 at 19:15
 
@chills: If last_send is <8 seconds, you don't add any tokens to the bucket. If your bucket contains tokens, you can send the message; otherwise you can't (you've already sent 5 messages in the last 8 secs) –   
Mar 20 '09 at 19:18
1  
I'd appreciate it if the folks downvoting this would please explain why... I'd like to fix any problems you see, but that's hard to do without feedback! –   
Mar 20 '09 at 22:22
6

to block processing until the message can be sent, thus queuing up further messages, antti's beautiful solution may also be modified like this:

rate = 5.0; // unit: messagesper  = 8.0; // unit: secondsallowance = rate; // unit: messageslast_check = now(); // floating-point, e.g. usec accuracy. Unit: secondswhen (message_received):  current = now();  time_passed = current - last_check;  last_check = current;  allowance += time_passed * (rate / per);  if (allowance > rate):    allowance = rate; // throttle  if (allowance < 1.0):    time.sleep( (1-allowance) * (per/rate))    forward_message();    allowance = 0.0;  else:    forward_message();    allowance -= 1.0;

it just waits until enough allowance is there to send the message. to not start with two times the rate, allowance may also initialized with 0.

转载地址:http://elhbi.baihongyu.com/

你可能感兴趣的文章
苹果Swift编程语言入门教程【中文版】
查看>>
捕鱼忍者(ninja fishing)之游戏指南+游戏攻略+游戏体验
查看>>
iphone开发基础之objective-c学习
查看>>
iphone开发之SDK研究(待续)
查看>>
计算机网络复习要点
查看>>
Variable property attributes or Modifiers in iOS
查看>>
NSNotificationCenter 用法总结
查看>>
C primer plus 基础总结(一)
查看>>
剑指offer算法题分析与整理(一)
查看>>
剑指offer算法题分析与整理(三)
查看>>
部分笔试算法题整理
查看>>
Ubuntu 13.10使用fcitx输入法
查看>>
pidgin-lwqq 安装
查看>>
mint/ubuntu安装搜狗输入法
查看>>
C++动态申请数组和参数传递问题
查看>>
opencv学习——在MFC中读取和显示图像
查看>>
retext出现Could not parse file contents, check if you have the necessary module installed解决方案
查看>>
pyQt不同窗体间的值传递(一)——对话框关闭时返回值给主窗口
查看>>
linux mint下使用外部SMTP(如网易yeah.net)发邮件
查看>>
北京联通华为光猫HG8346R破解改桥接
查看>>