陣列應用題計算h-index_Leetcode #274

小松鼠

發佈於演算法題目解析

2024/09/30 更新2023/10/01 發佈閱讀 4 分鐘

這題的題目在這裡

題目敘述

題目給定一個輸入陣列，陣列裡面的數字自分別代表每篇論文的被引用數。

要求我們計算h-index。

h-index的定義:
找一個最大的h值，使得有h篇論文，個別論文的被引用數都大於等於 h

測試範例

Example 1:

Input: citations = [3,0,6,1,5]
Output: 3
Explanation: [3,0,6,1,5] means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively.
Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, their h-index is 3.
有三篇論文，引用數都大於等於3
分別是[3, 6, 5] 這三篇

Example 2:

Input: citations = [1,3,1]
Output: 1
有一篇論文，引用數都大於等於1
任選一篇都 >= 1

約束條件

n == citations.length
1 <= n <= 5000
0 <= citations[i] <= 1000

演算法

除了傳統的O( n log n) 的排序演算法的解法以外。

還有O(n) Counting-based的計數排序法的解法。

把被引用數視為一個桶子，每當掃描過一篇論文，就把那篇論文對應到的被引用數累加一。

最後從高引用數掃描到低引用數，累積論文數量，如果當下累積論文數量已經大於等於被引用數，符合定義，則該引用數就是我們的h指標，也就是h index。

註:
實作上有一個空間優化的技巧，因為h index最大值 = 總論文篇數 = n，所以桶子的上邊界設在總論文篇數 n 即可。

程式碼

class Solution:
　def hIndex(self, citations: List[int]) -> int:

　　n = len(citations)

　　# key: citation
　　# value: counting of papaers with specific citations
　　counting_of = [ 0 for _ in range(n+1) ]

　　# Update counting_of table
　　for citation in citations:

　　　if citation >= n:
　　　　# H index is up to n at most
　　　　counting_of[n] += 1
　　　else:
　　　　counting_of[citation] += 1

　　# Compute h index
　　# The given researcher has published at least h papers that have each been cited at least h times.
　　couting_of_at_least_h = 0

　　for citation in range(n,-1,-1):

　　　couting_of_at_least_h += counting_of[citation]

　　　if couting_of_at_least_h >= citation:
　　　　return citation
　　
　　return 0

複雜度分析

時間複雜度:

O( n ) 每個論文掃描一次，引用數也掃描一次，皆為O(n)

空間複雜度:

O(n) h index 最大值頂多和論文總篇數相等，所以counting_of 佔用空間為O(n)

關鍵知識點

由定義出發，想到被引用數可以和Counting sort演算法模板結合。從高引用數掃描到低引用數，累積論文數量，如果當下累積論文數量已經大於等於被引用數，符合定義，則該引用數就是我們的h指標，也就是h index。

Reference:

[1] Python sol by sorting. [w/ Comment] - H-Index - LeetCode

留言

留言分享你的想法！

小松鼠的演算法樂園

96會員

427內容數

由有業界實戰經驗的演算法工程師，手把手教你建立解題的框架，一步步寫出高效、清晰易懂的解題答案。著重在讓讀者啟發思考、理解演算法，熟悉常見的演算法模板。深入淺出地介紹題目背後所使用的演算法意義，融會貫通演算法與資料結構的應用。在幾個經典的題目融入一道題目的多種解法，或者同一招解不同的題目，擴展廣度，並加深印象。

小松鼠的演算法樂園的其他內容

2024/10/06

🆚大小有別根據大小給予序號 Rank Transform of an Array_Leetcode #1331

題目敘述 Rank Transform of an Array 給定一個陣列arr，請根據數字的大小給予序號，序號值介於1~len( set(arr) )之間。最大的數字給予最大的序號。次大的數字給予次大的序號。 ...依此類推最小的數字給予最小的序號1。