As an Engineer — Optimize Python performance

更新 發佈閱讀 13 分鐘
When I was doing the development of the company’s data dashboard, I often pulled tables from various databases for calculation. API response time decreases when the pull time range is wide or when there are multiple tables. This post documents what I’m trying to optimize the process and other tricks that can improve program performance.

1. Change data storage structure

In addition to data types such as str and int, common Python data storage structures include list, tuple, and dict.

List should be a data structure that anyone who has learned any programming languages will be familiar with. A list is an array. Assume that there are a lot of data stored in the list today, and I need to retrieve specific data, and I have to search all over, as follows.

for i in someList:
print(i)

Time complexity is O(n).

If your data storage today is two-dimensional, such as user data in different countries. The first thing you need to do is to find the country, and then find the data you want, as follows:

for i in someList:
if i == someCountry:
for j in someList[i]:
print(j)

This time yime complexity up to O(n²).

Can tuples vs lists improve performance? As far as I know, there is no clear indication that tuples reduce time complexity. Because the principle of tuple is list, but tuple is a list whose length cannot be changed. When he first creates it, he requests a fixed length from memory. In addition to needing space for initialization and memory, List also needs a temporary storage area for future expansion of data. Therefore, in a tuple of the same data size, the required storage space is smaller than that of a list, and the space complexity is also lower.

Next, I will introduce a powerful function of python, which is dict. dict is short for dictionary. Since it is called a dictionary, there is a relationship between key and value. That is, the value is stored in a unique key. So if you want to query a specific information today, it is as follows:

print(someDict["yourKey"])

Time complexity is O(1).

Even with the example above, the data storage by country is as follows:

print(someDict["someCountry"]["yourKey"])

Time complexity still is O(1).

So in terms of data storage, I prefer Dict > Tuple > List

2. Use Redis

Redis — Powerful caching database. If your data doesn’t need to be updated immediately. (Not the Monitor). Redis is used in many places. When an API pulls a Table from DB to use, other APIs may also need it. At this time, if you have to re-enter SQL or NoSQL commands, the performance will be wasted. If we pull out the data of this table and temporarily store it in Redis, it will take about ten minutes. During these ten minutes, if your other API needs to use the same data, there is no need to re-download the SQL and DB request data. Requesting data from Redis is much faster.

Alternatively, return from your API can also be temporarily stored in Redis. Sometimes the user may refresh the page and resend the request. If these requests need to be recalculated in the background and re-request data from DB. This is very wasteful of performance. If you have saved the returned data in Redis at this time, you can return the result directly without any calculation.

3. Asyn functions for FastAPI

In the FastAPI framework, there is a very novel function, which is Async. Basically equivalent to Node.js’ async. Async allows you to wait for response times, temporarily store state, release thread, and handle other needs.

async def someFunc():
await something

Remember that await can only be used in async functions. I’ve tested it myself so far. If your function does not need the await object, it is recommended to remove async. No need to wait for processing to remove async functions for better performance.

4. Use multiprocessing

The literal meaning of Multiprocessing is easy to understand, that is, multiple process handle different tasks separately. It has been discussed on the Internet that multiprocessing is more efficient than multithreading in python. So far, I haven’t tested it myself, so I can’t draw conclusions. This article first introduces the two multiprocessing methods I currently use.

from multiprocessing import Pool
with Pool(processes=2) as p:
print(p.map(someFunc, [1, 2])

Pool can call some functions in parallel through the map function. Remember that the function must be written in the outermost layer.

from multiprocessing import Process
p = Process(target=someFunc, args=(1,2))
p.start()
#
# The original process continues to run
#
# Wait for p process done
p.join()

The difference between the Process method and the Pool is that after the Process is created, the original Process will continue to execute the program. Pool means that after the original process enters the pool, it will be divided into several processes according to the number of tasks.

Have fun~

留言
avatar-img
留言分享你的想法!
avatar-img
Samuel的沙龍
92會員
129內容數
除了翻譯各國新聞以外,會將過去演講的一些主題內容放上來。閒暇之餘,分享一些PM心得,歡迎參訪。
Samuel的沙龍的其他內容
2025/01/14
哈馬斯與以色列在卡達多哈達成停火協議,首階段將釋放33名人質,為期42天的停火協議將為下一階段全面和平談判鋪路。協議細節包括人質釋放、軍事部署、居民返家、囚犯交換等,但協議執行面臨潛在風險、國內反對聲音及第二階段談判挑戰。協議達成將為加沙地區帶來急需的人道援助,但能否真正實現和平仍有待觀察。
Thumbnail
2025/01/14
哈馬斯與以色列在卡達多哈達成停火協議,首階段將釋放33名人質,為期42天的停火協議將為下一階段全面和平談判鋪路。協議細節包括人質釋放、軍事部署、居民返家、囚犯交換等,但協議執行面臨潛在風險、國內反對聲音及第二階段談判挑戰。協議達成將為加沙地區帶來急需的人道援助,但能否真正實現和平仍有待觀察。
Thumbnail
2025/01/13
探討理性優越感如何偽裝成教導,並提出避免陷入高傲陷阱的方法,包含反思自身動機、多方提問等,以及如何分辨真誠教導與展現優越的差異。
Thumbnail
2025/01/13
探討理性優越感如何偽裝成教導,並提出避免陷入高傲陷阱的方法,包含反思自身動機、多方提問等,以及如何分辨真誠教導與展現優越的差異。
Thumbnail
2025/01/10
臺灣中華電信海底光纜受損事件引發國際關注,疑似中國船隻「順新39號」涉案,凸顯臺灣通訊基礎設施安全及灰色地帶行動的風險。臺灣正積極尋求應對措施,包括強化監控、部署低軌衛星、與國際合作等,以提升網路韌性,並防範未來潛在威脅。
Thumbnail
2025/01/10
臺灣中華電信海底光纜受損事件引發國際關注,疑似中國船隻「順新39號」涉案,凸顯臺灣通訊基礎設施安全及灰色地帶行動的風險。臺灣正積極尋求應對措施,包括強化監控、部署低軌衛星、與國際合作等,以提升網路韌性,並防範未來潛在威脅。
Thumbnail
看更多
你可能也想看
Thumbnail
透過蝦皮分潤計畫,輕鬆賺取零用金!本文分享5-6月實測心得,包含數據流程、實際收入、平臺優點及注意事項,並推薦高分潤商品,教你如何運用空閒時間創造被動收入。
Thumbnail
透過蝦皮分潤計畫,輕鬆賺取零用金!本文分享5-6月實測心得,包含數據流程、實際收入、平臺優點及注意事項,並推薦高分潤商品,教你如何運用空閒時間創造被動收入。
Thumbnail
單身的人有些會養寵物,而我養植物。畢竟寵物離世會傷心,植物沒養好再接再厲就好了~(笑)
Thumbnail
單身的人有些會養寵物,而我養植物。畢竟寵物離世會傷心,植物沒養好再接再厲就好了~(笑)
Thumbnail
不知你有沒有過這種經驗?衛生紙只剩最後一包、洗衣精倒不出來,或電池突然沒電。這次一次補貨,從電池、衛生紙到洗衣精,還順便分享使用心得。更棒的是,搭配蝦皮分潤計畫,愛用品不僅自己用得安心,分享給朋友還能賺回饋。立即使用推薦碼 X5Q344E,輕鬆上手,隨時隨地賺取分潤!
Thumbnail
不知你有沒有過這種經驗?衛生紙只剩最後一包、洗衣精倒不出來,或電池突然沒電。這次一次補貨,從電池、衛生紙到洗衣精,還順便分享使用心得。更棒的是,搭配蝦皮分潤計畫,愛用品不僅自己用得安心,分享給朋友還能賺回饋。立即使用推薦碼 X5Q344E,輕鬆上手,隨時隨地賺取分潤!
Thumbnail
身為一個典型的社畜,上班時間被會議、進度、KPI 塞得滿滿,下班後只想要找一個能夠安靜喘口氣的小角落。對我來說,畫畫就是那個屬於自己的小樹洞。無論是胡亂塗鴉,還是慢慢描繪喜歡的插畫人物,那個專注在筆觸和色彩的過程,就像在幫心靈按摩一樣,讓緊繃的神經慢慢鬆開。
Thumbnail
身為一個典型的社畜,上班時間被會議、進度、KPI 塞得滿滿,下班後只想要找一個能夠安靜喘口氣的小角落。對我來說,畫畫就是那個屬於自己的小樹洞。無論是胡亂塗鴉,還是慢慢描繪喜歡的插畫人物,那個專注在筆觸和色彩的過程,就像在幫心靈按摩一樣,讓緊繃的神經慢慢鬆開。
Thumbnail
複習一下: 我們學習了關於撰寫程式的相關觀念 條件分支(if/else) : 藉由條件分支讓程式執行相對應的功能。 迴圈(while loop ) :程式利用迴圈反覆執行某個區塊的程式碼。 字串處理 (string) : 每個程式都在處理資料,而字串是一種非常重要且常用的資料。 函式(fu
Thumbnail
複習一下: 我們學習了關於撰寫程式的相關觀念 條件分支(if/else) : 藉由條件分支讓程式執行相對應的功能。 迴圈(while loop ) :程式利用迴圈反覆執行某個區塊的程式碼。 字串處理 (string) : 每個程式都在處理資料,而字串是一種非常重要且常用的資料。 函式(fu
Thumbnail
List 清單 和 Tuple元組 清單在Python裡面非常的常用,大家一定要熟練這些基礎的元素。 在Python中,列表(List)是一種常用的資料類型,用於儲存一組有序的元素。列表是可變的(Mutable),這意味著你可以在列表中新增、刪除和修改元素。列表使用方括號 []
Thumbnail
List 清單 和 Tuple元組 清單在Python裡面非常的常用,大家一定要熟練這些基礎的元素。 在Python中,列表(List)是一種常用的資料類型,用於儲存一組有序的元素。列表是可變的(Mutable),這意味著你可以在列表中新增、刪除和修改元素。列表使用方括號 []
Thumbnail
list跟tuple 應用場景跟常用函式:append extend insert remove clear pop del
Thumbnail
list跟tuple 應用場景跟常用函式:append extend insert remove clear pop del
Thumbnail
How to utilize batch input and multi-processing techniques to accelerate feature engineering? 問題 在進行特徵工程的過程中,我們通常需要處理各種各樣的數據,並轉換它們成有意義的特徵,以供後續的模型訓練
Thumbnail
How to utilize batch input and multi-processing techniques to accelerate feature engineering? 問題 在進行特徵工程的過程中,我們通常需要處理各種各樣的數據,並轉換它們成有意義的特徵,以供後續的模型訓練
Thumbnail
我們將會學習 Python 中的數據結構。 主要的數據結構包括列表 (List)、元組 (Tuple)、字典 (Dictionary) 以及集合 (Set)。
Thumbnail
我們將會學習 Python 中的數據結構。 主要的數據結構包括列表 (List)、元組 (Tuple)、字典 (Dictionary) 以及集合 (Set)。
Thumbnail
探索Python學習筆記中列表的建立、存取和常用方法。從使用中括號定義列表到了解索引、新增、刪除、修改等操作,並介紹append、remove、count等常用方法。
Thumbnail
探索Python學習筆記中列表的建立、存取和常用方法。從使用中括號定義列表到了解索引、新增、刪除、修改等操作,並介紹append、remove、count等常用方法。
Thumbnail
陣列是Python語言的最基礎也最容易實作的資料結構,主要可以透過兩種方式在Python上實踐陣列,其中一種是靜態結構 - 串列(List),另一種則是動態結構 - 鏈結串列(Linked List)。 我們會依序介紹這兩種作法如何在Python上執行陣列的相關功能,並比較兩種方法之間的差異。
Thumbnail
陣列是Python語言的最基礎也最容易實作的資料結構,主要可以透過兩種方式在Python上實踐陣列,其中一種是靜態結構 - 串列(List),另一種則是動態結構 - 鏈結串列(Linked List)。 我們會依序介紹這兩種作法如何在Python上執行陣列的相關功能,並比較兩種方法之間的差異。
Thumbnail
在前面的文章中我們學習了關於字典的基本用法,今天再討論更多關於字典的其它用法,以及它和串列、元組等的關聯。
Thumbnail
在前面的文章中我們學習了關於字典的基本用法,今天再討論更多關於字典的其它用法,以及它和串列、元組等的關聯。
追蹤感興趣的內容從 Google News 追蹤更多 vocus 的最新精選內容追蹤 Google News