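Hamel Husain · September 1, 2020 09:00 · approx. 5 min read

fastcore: An Underrated Python Library

#PythonLibraries #DeveloperTools #OpenSource #ProgrammingEducation #fast.ai #CodeQuality

TL;DR

fastcore is a library that extends Python: it borrows patterns from other programming languages to cut boilerplate and provides practical tools that help developers learn and be more productive.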

AI in-depth analysis · March 1, 2026 20:45

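Key points

Contributing to open source as a way to learn

By writing documentation and tests for fastcore, the author came to understand the code deeply, explored its edge cases, and found the process an efficient way to learn.

Purpose and features of fastcore

fastcore is the foundation library of the fast.ai projects. It aims to extend Python, cut boilerplate, and add useful functionality for common tasks.

Ideas borrowed from other languages

fastcore brings patterns from languages as diverse as Julia, Ruby, and Haskell into Python, which motivates developers to learn other languages and lets them write more concise, expressive code.

A practical toolset

fastcore provides practical utilities that help developers solve new problems and write code more efficiently.

Making **kwargs transparent with the delegates decorator

With fastcore's delegates decorator, the parameters hidden behind a function's or class's **kwargs appear as explicit parameters in the signature, which makes the API easier to read.

Removing attribute-setting boilerplate with store_attr

With the store_attr function you can avoid repeating self.attr = attr in __init__ and set instance variables concisely.

Advantages of store_attr compared with dataclasses

store_attr does not rely on inheritance, does not require Python 3.7 or higher, and can be used at any point in the object lifecycle.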

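Impact analysis

Beyond introducing a specific technical library, this article shows the value of contributing to open source as an effective way to learn. Wider adoption of foundation libraries like fastcore could help raise productivity and promote knowledge sharing across the AI development community.

Editorial comment

Although this is an article introducing a technical library, its focus on open source contribution as a learning method is interesting. It is a good example of knowledge transfer within the AI developer community.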

python

def fit(x, transforms:list):
    "Fit a model after performing transformations"
    x = compose(*transforms)(x)
    y = [np.mean(x)] * len(x) # it's a dumb model. Don't judge me
    return y

# filters out elements < 5, adds 2, then predicts the mean
fit(x=[1,2,3,4,5,6], transforms=[filter5, add2])

More information about compose

A more useful __repr__

In Python, the __repr__ method provides the official string representation of an object. By default, it is not very informative:

python

class Test:
    def __init__(self, a, b=2, c=3):
        store_attr() # `store_attr` was discussed previously

Test(1)

<__main__.Test at 0x7ffcd766cee0>

We can use basic_repr to quickly give us a more sensible default:

python

class Test:
    def __init__(self, a, b=2, c=3):
        store_attr()
    __repr__ = basic_repr('a,b,c')

Test(2)

Test(a=2, b=2, c=3)
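To make the idea concrete, here is a minimal sketch of how a helper in the spirit of basic_repr could be written in plain Python. The name basic_repr_sketch is hypothetical and this is not fastcore's implementation; it only assumes that the listed attributes exist on the instance.

```python
def basic_repr_sketch(fields):
    """Hypothetical helper: build a __repr__ from a comma-separated list of attribute names."""
    names = [n.strip() for n in fields.split(',')]

    def _repr(self):
        # Render each attribute as name=value, using repr() for the values
        body = ', '.join(f'{n}={getattr(self, n)!r}' for n in names)
        return f'{type(self).__name__}({body})'

    return _repr


class Test:
    def __init__(self, a, b=2, c=3):
        self.a, self.b, self.c = a, b, c

    # Assigning the generated function in the class body makes it a method
    __repr__ = basic_repr_sketch('a,b,c')


text = repr(Test(2))
```

Because the helper returns an ordinary function, assigning it to __repr__ inside the class body is all that is needed.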

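Monkey Patching With A Decorator

It can be convenient to monkey patch with a decorator, which is especially helpful when you want to patch an external library you are importing. We can use the decorator @patch from fastcore.foundation: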

python

class MyClass(int):
    pass

@patch
def func(self:MyClass, a):
    return self+a

mc = MyClass(3)

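Still not convinced? I'll show you another example of this kind of patching in the next section.

A better pathlib.Path

When you see these extensions to pathlib.Path, you won't ever use vanilla pathlib again! A number of handy methods have been added to the Path class, such as: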

python

# the traditional way
with open('somefile', 'r') as f:
    f.read()

# with fastcore's extensions
Path('somefile').read()

Read more about this here. Here is a demonstration of the ls method:

python

from fastcore.utils import *
from pathlib import Path

p = Path('.')
p.ls() # you don't get this with vanilla pathlib.Path!!

(#7) [Path('2020-09-01-fastcore.ipynb'),Path('README.md'),Path('fastcore_imgs'),Path('2020-02-20-test.ipynb'),Path('.ipynb_checkpoints'),Path('2020-02-21-introducing-fastpages.ipynb'),Path('my_icons')]

Wait! What's going on here? We just imported pathlib.Path. This is made possible by the @patch decorator:

python

@patch
def fun(self:Path):
    return "This is fun!"

p.fun()

That is magical, right? I know! That's why I'm writing about it!
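The mechanism behind this kind of patching is less magical than it looks. Below is a minimal sketch, not fastcore's implementation: a hypothetical decorator patch_sketch reads the type hint on the first parameter and attaches the function to that class with setattr. The Greeter class and the greet function are made up for illustration.

```python
import inspect


def patch_sketch(f):
    """Hypothetical stand-in for @patch: attach f to the class annotated on its first parameter."""
    first_param = next(iter(inspect.signature(f).parameters.values()))
    cls = first_param.annotation   # e.g. Greeter in `self: Greeter`
    setattr(cls, f.__name__, f)    # monkey patch: the class gains a new method
    return f


class Greeter:
    def __init__(self, name):
        self.name = name


@patch_sketch
def greet(self: Greeter):
    return f"Hello, {self.name}!"


g = Greeter("fastcore")
message = g.greet()  # greet was defined outside the class body
```

The type hint does double duty here: it documents the intended receiver and tells the decorator which class to modify.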

An Even More Concise Way To Create Lambdas

python

arr=np.array([5,4,3,2,1])
f = lambda a: a.sum()
assert f(arr) == 15

You can use Self in the same way:

python

f = Self.sum()
assert f(arr) == 15

Let's create a lambda that does a groupby and mean of a Pandas dataframe:

python

import pandas as pd
df=pd.DataFrame({'Some Column': ['a', 'a', 'b', 'b', ],
                 'Another Column': [5, 7, 50, 70]})
f = Self.groupby('Some Column').mean()
f(df)

More about Self
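One way to picture how an object like Self can work is as a recorder of method calls that replays them later on a real object. The sketch below is an assumption about the general technique, not fastcore's code; SelfSketch, _CallRecorder, and _PendingCall are hypothetical names.

```python
class _PendingCall:
    """Holds a method name until it is called with arguments."""
    def __init__(self, steps, name):
        self._steps, self._name = steps, name

    def __call__(self, *args, **kwargs):
        # Record (name, args, kwargs) and return a recorder so calls can be chained
        return _CallRecorder(self._steps + ((self._name, args, kwargs),))


class _CallRecorder:
    """Records a chain of method calls and replays it on the object passed in."""
    def __init__(self, steps=()):
        self._steps = tuple(steps)

    def __getattr__(self, name):
        # Only reached for names that are not real attributes of the recorder
        return _PendingCall(self._steps, name)

    def __call__(self, obj):
        for name, args, kwargs in self._steps:
            obj = getattr(obj, name)(*args, **kwargs)
        return obj


SelfSketch = _CallRecorder()

f = SelfSketch.strip().upper()   # roughly: lambda s: s.strip().upper()
result = f('  hi ')
```

Each recorded step is immutable, so SelfSketch itself can be reused to build any number of independent call chains.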

Notebook functions

These are simple but handy, and let you know whether your code is running in a Jupyter Notebook, Colab, or an IPython shell:

python

from fastcore.imports import in_notebook, in_colab, in_ipython
in_notebook(), in_colab(), in_ipython()

(True, False, True)

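This is useful if you are displaying certain types of visualizations, progress bars, or animations in your code that you may want to modify or toggle depending on the environment.

A drop-in replacement for list

You might be pretty happy with Python's list. However, fastcore's L class works as a drop-in replacement for list and provides many handy extra methods.

The best way I can describe L is to use it. Define a list (check out the nice __repr__):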

python

p = L(1,2,3)
p

(#3) [1,2,3]

Shuffle a list:

python

p = L.range(20).shuffle()
p

(#20) [8,7,5,12,14,16,2,15,19,6...]

Index into a list:

python

p[2,4,6]

(#3) [5,14,2]

L has sensible defaults, for example appending elements to a list:

python

p + [100,200]

(#22) [8,7,5,12,14,16,2,15,19,6...]
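For readers curious how behavior like this can be layered on top of the built-in list, here is a small sketch. ListSketch is a hypothetical class that imitates only three of the features shown above (the length-prefixed repr, indexing with several positions at once, and + returning the same type). It is not how L is implemented and, unlike the output above, it does not truncate long lists.

```python
class ListSketch(list):
    """Hypothetical list subclass imitating a few conveniences of fastcore's L."""

    def __getitem__(self, idx):
        if isinstance(idx, tuple):
            # p[2, 4, 6] arrives here as the tuple (2, 4, 6)
            return ListSketch(list.__getitem__(self, i) for i in idx)
        res = list.__getitem__(self, idx)
        return ListSketch(res) if isinstance(idx, slice) else res

    def __add__(self, other):
        # Keep the enhanced type instead of falling back to a plain list
        return ListSketch(list(self) + list(other))

    def __repr__(self):
        return f'(#{len(self)}) [{",".join(map(repr, self))}]'


p = ListSketch([8, 7, 5, 12, 14, 16, 2])
picked = p[2, 4, 6]
longer = p + [100, 200]
```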

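There is much more that L has to offer.

But wait... there's more!

There are more things I would like to show you about fastcore, but there is no way they would reasonably fit into a blog post. Here is a list of some of my favorite things that I didn't demo in this blog post:

Basic utilities

The Basics section contains many shortcuts for common tasks, as well as additional interfaces to what standard Python provides.

mk_class: quickly add a bunch of attributes to a class
wrap_class: add new methods to a class with a simple decorator
groupby: similar to Scala's groupBy
merge: merge dicts
fasttuple: a tuple on steroids
Infinite lists: useful for padding and testing
chunked: for batching and organizing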
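As an illustration of the batching idea behind the last item, the sketch below splits any iterable into fixed-size lists using only the standard library. chunked_sketch is a hypothetical name, and the signature of fastcore's own chunked may differ; the walrus operator requires Python 3.8 or later.

```python
from itertools import islice


def chunked_sketch(iterable, size):
    """Hypothetical helper: yield successive lists of at most `size` items."""
    it = iter(iterable)
    # islice pulls up to `size` items per pass; an empty list ends the loop
    while chunk := list(islice(it, size)):
        yield chunk


batches = list(chunked_sketch(range(7), 3))
```

Because it works on an iterator, the helper never needs the whole input in memory at once, which is the usual reason to batch in the first place.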

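Multiprocessing

The Multiprocessing section extends Python's multiprocessing library by offering features like:

the ability to pause, to mitigate race conditions with external services
processing things in batches on each worker, e.g. when you have a vectorized operation to perform in chunks

Functional programming

The functional programming section is my favorite part of this library.

maps: a map that also composes functions
mapped: a more robust map
using_attr: compose a function that operates on an attribute

Transforms

Transforms is a collection of utilities for creating data transformations and associated pipelines. These transformation utilities build upon many of the building blocks discussed in this blog post.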

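Further reading

Note that you should read the main page of the docs first, followed by the section on tests, to fully understand the documentation.

Shameless plug: fastpages

This blog post was written entirely in a Jupyter Notebook, which GitHub automatically converted into a blog post! Sound interesting? Check out fastpages.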

Original text

I recently embarked on a journey to sharpen my Python skills: I wanted to learn advanced patterns, idioms, and techniques. I started by reading books on advanced Python; however, the information didn't seem to stick without having somewhere to apply it. I also wanted the ability to ask questions of an expert while I was learning -- which is an arrangement that is hard to find! That's when it occurred to me: What if I could find an open source project that has fairly advanced Python code and write documentation and tests? I made a bet that if I did this it would force me to learn everything very deeply, and the maintainers would be appreciative of my work and be willing to answer my questions.

And that's exactly what I did over the past month! I'm pleased to report that it has been the most efficient learning experience I've ever had. I've discovered that writing documentation forced me to deeply understand not just what the code does but also why the code works the way it does, and to explore edge cases while writing tests. Most importantly, I was able to ask questions when I was stuck, and maintainers were willing to devote extra time knowing that their mentorship was in service of making their code more accessible! It turns out the library I chose, fastcore, is some of the most fascinating Python I have ever encountered, as its purpose and goals are fairly unique.

For the uninitiated, fastcore is a library on top of which many fast.ai projects are built. Most importantly, fastcore extends the Python programming language and strives to eliminate boilerplate and add useful functionality for common tasks. In this blog post, I'm going to highlight some of my favorite tools that fastcore provides, rather than sharing what I learned about Python. My goal is to pique your interest in this library, and hopefully motivate you to check out the documentation after you are done to learn more!

Why fastcore is interesting

Get exposed to ideas from other languages without leaving python: I’ve always heard that it is beneficial to learn other languages in order to become a better programmer. From a pragmatic point of view, I’ve found it difficult to learn other languages because I could never use them at work. Fastcore extends python to include patterns found in languages as diverse as Julia, Ruby and Haskell. Now that I understand these tools I am motivated to learn other languages.

You get a new set of pragmatic tools: fastcore includes utilities that will allow you to write more concise expressive code, and perhaps solve new problems.

Learn more about the Python programming language: Because fastcore extends the python programming language, many advanced concepts are exposed during the process. For the motivated, this is a great way to see how many of the internals of python work.

A whirlwind tour through fastcore

Here are some things you can do with fastcore that immediately caught my attention.

Making **kwargs transparent

Whenever I see a function that has the argument **kwargs, I cringe a little. This is because it means the API is obfuscated and I have to read the source code to figure out what valid parameters might be. Consider the below example:

python

import inspect

def baz(a, b=2, c=3, d=4):
    return a + b + c

def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

Without reading the source code, it might be hard for me to know that foo also accepts the additional parameters b and d. We can fix this with delegates:

python

def baz(a, b=2, c=3, d=4):
    return a + b + c

@delegates(baz) # this decorator will pass down keyword arguments from baz
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

You can customize the behavior of this decorator. For example, you can have your cake and eat it too by passing down your arguments and also keeping **kwargs:

python

@delegates(baz, keep=True)
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)

inspect.signature(foo)

You can also exclude arguments. For example, we exclude argument d:

python

def basefoo(a, b=2, c=3, d=4): pass

@delegates(basefoo, but=['d']) # exclude d
def foo(c, a, **kwargs): pass

inspect.signature(foo)

You can also delegate between classes:

python

class BaseFoo:
    def __init__(self, e, c=2): pass

@delegates() # since no argument was passed here we delegate to the superclass
class Foo(BaseFoo):
    def __init__(self, a, b=1, **kwargs):
        super().__init__(**kwargs)

inspect.signature(Foo)

For more information, read the docs on delegates.
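To show what making **kwargs transparent can mean mechanically, here is a rough sketch built on the standard inspect module. delegates_sketch is a hypothetical decorator written for this illustration, not fastcore's implementation: it replaces the **kwargs entry in the decorated function's reported signature with the defaulted parameters of the target function, and supports none of the keep or but options described above.

```python
import inspect


def delegates_sketch(to):
    """Hypothetical decorator: expose the defaulted parameters of `to` in place of **kwargs."""
    def decorator(f):
        sig = inspect.signature(f)
        params = dict(sig.parameters)
        var_kw = [n for n, p in params.items()
                  if p.kind is inspect.Parameter.VAR_KEYWORD]
        if not var_kw:
            return f                      # nothing to expand
        params.pop(var_kw[0])             # drop the **kwargs entry
        for name, p in inspect.signature(to).parameters.items():
            # Only parameters with defaults that f does not already name
            if p.default is not inspect.Parameter.empty and name not in params:
                params[name] = p.replace(kind=inspect.Parameter.KEYWORD_ONLY)
        # inspect.signature() honors __signature__ when it is set
        f.__signature__ = sig.replace(parameters=list(params.values()))
        return f
    return decorator


def baz(a, b=2, c=3, d=4):
    return a + b + c


@delegates_sketch(baz)
def foo(c, a, **kwargs):
    return c + baz(a, **kwargs)


shown = str(inspect.signature(foo))
```

Only the reported signature changes; foo still receives the extra arguments through **kwargs at call time, so its behavior is untouched.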

Avoid boilerplate when setting instance attributes

Have you ever wondered if it was possible to avoid the boilerplate involved with setting attributes in __init__?

python

class Test:
    def __init__(self, a, b, c):
        self.a, self.b, self.c = a, b, c

Ouch! That was painful. Look at all the repeated variable names. Do I really have to repeat myself like this when defining a class? Not anymore! Check out store_attr:

python

class Test:
    def __init__(self, a, b, c):
        store_attr()

t = Test(5,4,3)
assert t.b == 4

You can also exclude certain attributes:

python

class Test:
    def __init__(self, a, b, c):
        store_attr(but=['c'])

t = Test(5,4,3)
assert t.b == 4
assert not hasattr(t, 'c')

There are many more ways of customizing and using store_attr.

P.S. you might be thinking that Python dataclasses also allow you to avoid this boilerplate. While true in some cases, store_attr is more flexible. For example, store_attr does not rely on inheritance, which means you won't get stuck using multiple inheritance when using this with your own classes. Also, unlike dataclasses, store_attr does not require Python 3.7 or higher. Furthermore, you can use store_attr anytime in the object lifecycle, and in any location in your class to customize the behavior of how and when variables are stored.
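A helper like this needs some way to see the arguments of the method that called it. One possible approach, sketched below under the assumption of CPython-style frame introspection, is to inspect the caller's stack frame. store_attr_sketch is a hypothetical name and this is not necessarily how fastcore does it.

```python
import inspect


def store_attr_sketch(but=()):
    """Hypothetical helper: copy the calling method's arguments onto its `self`."""
    frame = inspect.currentframe().f_back        # frame of the calling __init__
    args, _, _, values = inspect.getargvalues(frame)
    self_obj = values[args[0]]                   # first argument is assumed to be self
    for name in args[1:]:
        if name not in but:
            setattr(self_obj, name, values[name])


class Test:
    def __init__(self, a, b, c):
        store_attr_sketch(but=['c'])


t = Test(5, 4, 3)
```

Relying on stack frames keeps the call site short, at the cost of depending on interpreter details that are not available on every Python implementation.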

Avoiding subclassing boilerplate

One thing I hate about Python is the super().__init__() boilerplate associated with subclassing. For example:

python

class ParentClass:
    def __init__(self):
        self.some_attr = 'hello'

class ChildClass(ParentClass):
    def __init__(self):
        super().__init__()

cc = ChildClass()
assert cc.some_attr == 'hello' # only accessible b/c you used super

We can avoid this boilerplate by using the metaclass PrePostInitMeta. We define a new class called NewParent that wraps ParentClass:

python

class NewParent(ParentClass, metaclass=PrePostInitMeta):
    def __pre_init__(self, *args, **kwargs):
        super().__init__()

class ChildClass(NewParent):
    def __init__(self): pass

sc = ChildClass()
assert sc.some_attr == 'hello'
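For readers who have not written a metaclass before, the following sketch shows one way a hook like __pre_init__ could be wired up. PrePostInitSketch is a hypothetical metaclass written for this illustration; it overrides __call__, which runs whenever a class is instantiated, and is not claimed to match fastcore's PrePostInitMeta in detail.

```python
class PrePostInitSketch(type):
    """Hypothetical metaclass: run __pre_init__ before and __post_init__ after __init__."""

    def __call__(cls, *args, **kwargs):
        obj = cls.__new__(cls)                    # create the bare instance
        if hasattr(obj, '__pre_init__'):
            obj.__pre_init__(*args, **kwargs)
        obj.__init__(*args, **kwargs)
        if hasattr(obj, '__post_init__'):
            obj.__post_init__(*args, **kwargs)
        return obj


class ParentClass:
    def __init__(self):
        self.some_attr = 'hello'


class NewParent(ParentClass, metaclass=PrePostInitSketch):
    def __pre_init__(self, *args, **kwargs):
        super().__init__()                        # runs ParentClass.__init__


class ChildClass(NewParent):
    def __init__(self):
        pass                                      # no super().__init__() needed


sc = ChildClass()
```

The parent's initialization now happens in the hook, so subclasses can define __init__ freely without remembering to call super().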

Type dispatch

Type dispatch, or multiple dispatch, allows you to change the way a function behaves based upon the input types it receives. This is a prominent feature in some programming languages like Julia. For example, this is a conceptual example of how multiple dispatch works in Julia, returning different values depending on the input types of x and y:

julia

collide_with(x::Asteroid, y::Asteroid) = ... # deal with asteroid hitting asteroid
collide_with(x::Asteroid, y::Spaceship) = ... # deal with asteroid hitting spaceship
collide_with(x::Spaceship, y::Asteroid) = ... # deal with spaceship hitting asteroid
collide_with(x::Spaceship, y::Spaceship) = ... # deal with spaceship hitting spaceship

Type dispatch can be especially useful in data science, where you might allow different input types (e.g. NumPy arrays and Pandas dataframes) to a function that processes data. Type dispatch allows you to have a common API for functions that do similar tasks.

Unfortunately, Python does not support this out of the box. Fortunately, there is the @typedispatch decorator to the rescue. This decorator relies upon type hints in order to route inputs to the correct version of the function:

python

@typedispatch
def f(x:str, y:str):
    return f'{x}{y}'

@typedispatch
def f(x:np.ndarray):
    return x.sum()

@typedispatch
def f(x:int, y:int):
    return x+y

Below is a demonstration of type dispatch at work for the function f:

python

f('Hello ', 'World!')

f(np.array([5,5,5,5]))

There are limitations of this feature, as well as other ways of using this functionality that you can read about here. In the process of learning about type dispatch, I also found a Python library called multipledispatch made by Matthew Rocklin (the creator of Dask).

After using this feature, I am now motivated to learn languages like Julia to discover what other paradigms I might be missing.
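For comparison, the standard library ships a more limited relative of this idea: functools.singledispatch, which selects an implementation based on the type of the first argument only. The sketch below uses a made-up function, describe, to show the pattern; it illustrates the general technique rather than @typedispatch itself.

```python
from functools import singledispatch


@singledispatch
def describe(x):
    # Fallback used when no more specific type is registered
    return f'object: {x!r}'


@describe.register
def _(x: str):
    return f'string of length {len(x)}'


@describe.register
def _(x: int):
    return f'integer {x}'


results = [describe('abc'), describe(7), describe(1.5)]
```

Dispatching on several arguments at once, as in the Julia example above, is beyond what singledispatch offers.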

A better version of functools.partial

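functools.partial is a handy utility that creates a new function from an existing one with some arguments preset. Take this function, for example, which filters a list to keep only values >= val:

python

test_input = [1,2,3,4,5,6]

def f(arr, val):
    "Filter a list to remove any values that are less than val."
    return [x for x in arr if x >= val]

f(test_input, 3)

You can create a new function out of this function using partial that sets the default value of val to 5:

python

from functools import partial

filter5 = partial(f, val=5)
filter5(test_input)

One problem with partial is that you lose the docstring of the original function; the new function reports the docstring of partial instead:

python

filter5.__doc__

'partial(func, *args, **keywords) - new function with partial application\n of the given arguments and keywords.\n'

fastcore.utils.partialler fixes this, and makes sure the docstring is retained such that the new API is transparent:

python

filter5 = partialler(f, val=5)
filter5.__doc__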

'Filter a list to remove any values that are less than val.'
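The fix can be pictured in a couple of lines of plain Python: build the partial object, then copy the original docstring onto it. The helper below, partial_with_doc, is a hypothetical name for this sketch and covers only the docstring behavior described above.

```python
from functools import partial


def partial_with_doc(func, *args, **kwargs):
    """Hypothetical helper: like functools.partial, but keeps func's docstring."""
    new_func = partial(func, *args, **kwargs)
    new_func.__doc__ = func.__doc__   # partial objects accept attribute assignment
    return new_func


def f(arr, val):
    "Filter a list to remove any values that are less than val."
    return [x for x in arr if x >= val]


filter5 = partial_with_doc(f, val=5)
kept = filter5([1, 2, 3, 4, 5, 6])
```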

Composition of functions

A technique that is pervasive in functional programming languages is function composition, whereby you chain a bunch of functions together to achieve some kind of result. This is especially useful when applying various data transformations. Consider a toy example where I have three functions: (1) removes elements of a list less than 5 (from the prior section), (2) adds 2 to each number, (3) sums all the numbers:

python

def add(arr, val):
    return [x + val for x in arr]

def arrsum(arr):
    return sum(arr)

# See the previous section on partialler
add2 = partialler(add, val=2)

transform = compose(filter5, add2, arrsum)
transform([1,2,3,4,5,6])

But why is this useful? You might be thinking, I can accomplish the same thing with:

python

arrsum(add2(filter5([1,2,3,4,5,6])))

You are not wrong! However, composition gives you a convenient interface in case you want to do something like the following:

python

def fit(x, transforms:list):
    "fit a model after performing transformations"
    x = compose(*transforms)(x)
    y = [np.mean(x)] * len(x) # it's a dumb model. Don't judge me
    return y

# filters out elements < 5, adds 2, then predicts the mean
fit(x=[1,2,3,4,5,6], transforms=[filter5, add2])

For more information about compose, read the docs.
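Composition itself is easy to sketch with the standard library. The version below, compose_sketch, is a hypothetical stand-in that applies its functions from left to right, matching the order used in the example above, where filtering comes first and summing last.

```python
from functools import partial, reduce


def compose_sketch(*funcs):
    """Hypothetical helper: return a function that applies funcs left to right."""
    def composed(x):
        # Feed the running result into each function in turn
        return reduce(lambda acc, fn: fn(acc), funcs, x)
    return composed


def keep_at_least(arr, val):
    return [x for x in arr if x >= val]


def add(arr, val):
    return [x + val for x in arr]


filter5 = partial(keep_at_least, val=5)
add2 = partial(add, val=2)

transform = compose_sketch(filter5, add2, sum)
total = transform([1, 2, 3, 4, 5, 6])   # [5, 6] -> [7, 8] -> 15
```

Because the pipeline is just a sequence of functions, it can be stored in a list and passed around, which is what the fit example above relies on.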

A more useful __repr__

In Python, the __repr__ method provides the official string representation of an object. By default, it is not very informative:

python

class Test:
    def __init__(self, a, b=2, c=3):
        store_attr() # store_attr was discussed previously

Test(1)

<__main__.Test at 0x7ffcd766cee0>

We can use basic_repr to quickly give us a more sensible default:

python

class Test:
    def __init__(self, a, b=2, c=3):
        store_attr()
    __repr__ = basic_repr('a,b,c')

Test(2)

Test(a=2, b=2, c=3)

Monkey Patching With A Decorator

It can be convenient to monkey patch with a decorator, which is especially helpful when you want to patch an external library you are importing. We can use the decorator @patch from fastcore.foundation, along with type hints, like so:

python

class MyClass(int):
    pass

@patch
def func(self:MyClass, a):
    return self+a

mc = MyClass(3)

Still not convinced? I'll show you another example of this kind of patching in the next section.

A better pathlib.Path

When you see these extensions to pathlib.Path, you won't ever use vanilla pathlib again! A number of additional methods have been added to pathlib, such as shortcuts for:

python

with open('somefile', 'r') as f:
    f.readlines()

python

with open('somefile', 'r') as f:
    f.read()

Read more about this here. Here is a demonstration of ls:

python

from fastcore.utils import *
from pathlib import Path

p = Path('.')
p.ls() # you don't get this with vanilla pathlib.Path!!

(#7) [Path('2020-09-01-fastcore.ipynb'),Path('README.md'),Path('fastcore_imgs'),Path('2020-02-20-test.ipynb'),Path('.ipynb_checkpoints'),Path('2020-02-21-introducing-fastpages.ipynb'),Path('my_icons')]

Wait! What's going on here? We just imported pathlib.Path. This is made possible by the @patch decorator:

python

@patch
def fun(self:Path):
    return "This is fun!"

p.fun()

That is magical, right? I know! That's why I'm writing about it!

An Even More Concise Way To Create Lambdas

python

arr=np.array([5,4,3,2,1])
f = lambda a: a.sum()
assert f(arr) == 15

You can use Self in the same way:

python

f = Self.sum()
assert f(arr) == 15

Let's create a lambda that does a groupby and mean of a Pandas dataframe:

python

import pandas as pd
df=pd.DataFrame({'Some Column': ['a', 'a', 'b', 'b', ],
                 'Another Column': [5, 7, 50, 70]})
f = Self.groupby('Some Column').mean()
f(df)
