[LeetCode] 819. Most Common Word

minzihun 2022. 11. 17. 13:28

 Given a string paragraph and a string array of the banned words banned, return the most frequent word that is not banned. It is guaranteed there is at least one word that is not banned, and that the answer is unique.

The words in paragraph are case-insensitive and the answer should be returned in lowercase.

 

Example:

Input: paragraph = "Bob hit a ball, the hit BALL flew far after it was hit.", banned = ["hit"]
Output: "ball"
Explanation: 
"hit" occurs 3 times, but it is a banned word.
"ball" occurs twice (and no other word does), so it is the most frequent non-banned word in the paragraph. 
Note that words in the paragraph are not case sensitive,
that punctuation is ignored (even if adjacent to words, such as "ball,"), 
and that "hit" isn't the answer even though it occurs more because it is banned.

 

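 I converted the string to lowercase, replaced the punctuation with spaces, split it on whitespace into a list, and removed the banned words. Each remaining word then goes into a dictionary as a key, with its number of occurrences as the value, and the key with the largest value is returned.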

import re
from typing import List

class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:
        
        # change upper case to lower case
        paragraph = paragraph.lower()
        
        # replace punctuation with spaces
        paragraph = re.sub(r'[^\w\s]', " ", paragraph)
        
        # split the sentence on whitespace to make a list of words
        paragraph_list = paragraph.split()
        
        # count every word that is not banned
        cnt_dic = {}
        
        for e in paragraph_list:
            # skip banned words by comparing whole words; calling
            # str.replace on the full paragraph would also cut banned
            # strings out of the middle of longer words
            if e in banned:
                continue
            if e not in cnt_dic:
                cnt_dic[e] = 1
            else:
                cnt_dic[e] += 1
        
        # return the key with the largest count
        return max(cnt_dic, key=cnt_dic.get)
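
One detail worth checking in this kind of solution is whether banned words are dropped as whole words or as substrings. The sketch below contrasts the two approaches; the helper names and the sample sentence are made up for illustration and are not part of the problem statement.

```python
import re

def strip_banned_by_replace(paragraph, banned):
    # Hypothetical helper: deletes each banned string wherever it appears,
    # including inside longer words.
    text = re.sub(r'[^\w\s]', ' ', paragraph.lower())
    for b in banned:
        text = text.replace(b, "")
    return text.split()

def strip_banned_by_token(paragraph, banned):
    # Hypothetical helper: splits first, then drops only exact word matches.
    text = re.sub(r'[^\w\s]', ' ', paragraph.lower())
    banned_set = set(banned)
    return [w for w in text.split() if w not in banned_set]

sample = "A ball hit a wall."
by_replace = strip_banned_by_replace(sample, ["a"])
by_token = strip_banned_by_token(sample, ["a"])

print(by_replace)  # ['bll', 'hit', 'wll']
print(by_token)    # ['ball', 'hit', 'wall']
```

With a short banned word such as "a", the substring version damages "ball" and "wall", while the token version leaves them intact.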

 

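  Looking at the reference solution, it uses the condition clause of a list comprehension to do in a single line what my code did over many. It also uses Counter from the collections module to handle the counting cleanly. most_common(1) returns a list holding the single most frequent (word, count) pair, and [0][0] takes the word out of that pair to return it.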

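import collections
import re
from typing import List

class Solution:
    def mostCommonWord(self, paragraph: str, banned: List[str]) -> str:

        # replace non-word characters with spaces, lowercase, split, and keep only non-banned words
        words = [word for word in re.sub(r'[^\w]', ' ', paragraph).lower().split() if word not in banned]

        # count how many times each remaining word occurs
        counts = collections.Counter(words)

        # most_common(1) is a one-element list of (word, count); take the word
        return counts.most_common(1)[0][0]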
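
To make the shape of the most_common result concrete, here is a small standalone sketch that runs the same steps on the example from the problem statement. The variable names top and answer are only for illustration.

```python
import collections
import re

paragraph = "Bob hit a ball, the hit BALL flew far after it was hit."
banned = ["hit"]

# Same pipeline as the solution: clean, lowercase, split, filter.
words = [w for w in re.sub(r'[^\w]', ' ', paragraph).lower().split() if w not in banned]
counts = collections.Counter(words)

top = counts.most_common(1)  # a list with one (word, count) tuple
answer = top[0][0]           # first tuple, then its first element

print(top)     # [('ball', 2)]
print(answer)  # ball
```

The first [0] selects the tuple from the list and the second [0] selects the word from the tuple, which is why the solution returns counts.most_common(1)[0][0].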