Old scraps

scrap

A temporary dumping ground. It ended up like this during a period when sorting things was too much work.
I may sort it out someday.

List

SQS
- https://dev.classmethod.jp/articles/re-introduction-2022-aws-sqs/
Beanstalk
- https://dev.classmethod.jp/articles/elastic-beanstalk-laravel-deploy/
Graph Neural Network
- https://pytorch-geometric.readthedocs.io/en/latest/
- https://fintan.jp/page/499/
- https://arxiv.org/abs/2108.00955
- https://ai-scholar.tech/articles/gnn/GNN_review
NVIDIA NGC Catalog
- https://www.nvidia.com/ja-jp/gpu-cloud/
Cloud Functions
- https://cloud.google.com/solutions/streaming-data-from-cloud-storage-into-bigquery-using-cloud-functions?hl=ja
TensorFlow functional API
- https://keras.io/ja/getting-started/functional-api-guide/
- https://www.tensorflow.org/guide/keras/functional
JAX
- In the end, maybe scale-out is its main strength?
- https://techblog.zozo.com/entry/scalable-machine-learning-with-JAX
ivy
- Unifies all frameworks with ivy
- https://github.com/unifyai/ivy
How we auto-generated an API client and types with OpenAPI Generator
- https://devblog.thebase.in/entry/2022/03/28/130016
Getting a loose understanding of feature stores
- https://www.nogawanogawa.com/entry/feature_store
TensorFlow Recommenders
- https://inside.dmm.com/entry/2022/3/22/engineer-recommend
TRILLsson / CAP12
- A speech paper (small speech representations for paralinguistic tasks, rather than ASR itself).
- https://www.marktechpost.com/2022/03/14/in-the-latest-google-ais-research-the-team-explains-how-they-reduced-the-size-of-the-high-performing-cap12-model-by-6x-100x-while-maintaining-90-96-of-the-performance-in-trillsson-models/
- https://ai.googleblog.com/2022/03/trillsson-small-universal-speech.html
- https://webbigdata.jp/ai/post-12950
Tidying up notebooks with Gather in VS Code
- https://xtech.nikkei.com/atcl/nxt/column/18/01960/022800005/
CSS is definitely evolving! Implementation techniques for writing logic in CSS: variables, conditionals, loops, logical operations, and more
- https://coliss.com/articles/build-websites/operation/css/writing-logic-in-css.html
Solving missing data that hurts analysis by making full use of VS Code
- https://xtech.nikkei.com/atcl/nxt/column/18/01960/022400002/
Math from age 14: what American middle schoolers learn
- https://diamond.jp/articles/-/298853
- https://www.amazon.co.jp/o/ASIN/4478112177/booksonlinep-22/
Japanese-language materials added to Google's computer science learning program "CS First"
- https://forest.watch.impress.co.jp/docs/serial/progedu/1393798.html
Microsoft's latest machine learning research introduces μTransfer, a new technique that can tune the 6.7-billion-parameter GPT-3 model using only 7% of the pretraining compute
- https://www.marktechpost.com/2022/03/11/microsofts-latest-machine-learning-research-introduces-%CE%BCtransfer-a-new-technique-that-can-tune-the-6-7-billion-parameter-gpt-3-model-using-only-7-of-the-pretraining-compute/
React component design patterns for 2022
- https://blog.logrocket.com/react-component-design-patterns-2022/
BERT 101 🤗 State Of The Art NLP Model Explained
- https://huggingface.co/blog/bert-101
Researchers from China propose a new pre-trained language model called PERT for natural language understanding (NLU)
- https://www.marktechpost.com/2022/03/22/researchers-from-china-propose-a-new-pre-trained-language-model-called-pert-for-natural-language-understanding-nlu/
Roundup of tools I have found useful since becoming an engineer
- https://zenn.dev/nakaatsu/articles/7133e16a0f787c
Auto-generating OpenAPI specs and Express routing with tsoa, a lightweight web framework
- https://zenn.dev/briete/articles/e556424c18e68d
Trying the new test framework Vitest with React | DevelopersIO
- https://dev.classmethod.jp/articles/intro-vitest/
Deep learning highlights of 2021 (research papers) - Qiita
- https://qiita.com/shionhonda/items/bf7194c3fa1412755a4c
So when should you actually use useMemo? My definitive answer - Qiita
- https://qiita.com/uhyo/items/5258e04aba380531455a
Valid TypeScript, Chapter 3: all about types
- https://uncle-javascript.com/valid-typescript-chapter3/
What designers and front-end engineers should know about web fonts
- https://zenn.dev/tak_dcxi/articles/588fbc205251043dc357
[JavaScript] Two reasons you should not use parseInt() to convert a real number to an integer - Qiita
- https://qiita.com/yamazaki3104/items/b45ed354a110780aef1f
Seven good books that Yamazaki, M3 executive officer, VPoE, and PdM, recommends to engineers, QA, designers, and product managers - エムスリーテ...
- https://www.m3tech.blog/entry/vpoe-book-review-2021
My seven-year path to becoming a Kaggle Grandmaster - のんびりしているエンジニアの日記
- https://nonbiri-tereka.hatenablog.com/entry/2021/12/25/221425
How to use the Python library injector
- https://wiki.plasticheart.info/python-injector
Trying dependency injection (DI) in Python - Qiita
- https://qiita.com/arata-honda/items/7b0e65ad842d31a9eb84
A world map of algorithms - Qiita
- https://qiita.com/square1001/items/6d414167ca95c97bd8b2
Training materials for new engineers that companies released for free in 2021: machine learning, game development, intro to AWS, math, and more (ITmedia NEWS...
- https://news.yahoo.co.jp/articles/66c8aa2aece01355e1d32a9ebb41f1ff4e529b49
Overview of Monte Carlo Tree Search (MCTS) - Liberal Art’s diary
- https://lib-arts.hatenablog.com/entry/rl_MCTS_basic
Running deep learning models nicely with gokart + PyTorch Lightning - そぬばこ
- https://nersonu.hatenablog.com/entry/sansan-advent-calendar-2021
Cognitive biases that distort design - Qiita
- https://qiita.com/MinoDriven/items/8e4abda43b05cd7a0200
DI (dependency injection) means injecting dependencies...? - Qiita
- https://qiita.com/iTakahiro/items/353a11f6c9d2a927158d
MLOps 2020: start small and grow big | AI tech studio
- https://cyberagent.ai/blog/research/12898/
The definitive answer to messy machine learning parameter management: easy centralized management by combining Hydra, MLflow, and Optuna
- https://logmi.jp/tech/articles/325087
Improving the accuracy of Japanese named entity recognition with BERT: introducing BERT-CRF - Sansan Builders Blog
- https://buildersbox.corp-sansan.com/entry/2021/09/21/120000
21 best practices for a clean React project - Qiita
- https://qiita.com/baby-degu/items/ea4eede60bbe9c63a348
4 differences between AWS and Heroku, plus 6 Heroku features and characteristics | FEnet AWS Column
- https://www.fenet.jp/aws/column/aws-beginner/805/
https://arrow.apache.org/docs/python/ipc.html
(Translation) Apache Arrow and the "10 Things I Hate About pandas" - Qiita
- https://qiita.com/tamagawa-ryuji/items/3d8fc52406706ae0c144
Converting to Parquet files easily and quickly with Apache Arrow (PyArrow) | DevelopersIO
- https://dev.classmethod.jp/articles/20190614-apache-arrow-parquet/
Comparing the blazing-fast cuDF with pandas - Taste of Tech Topics
- https://acro-engineer.hatenablog.com/entry/2020/12/10/120000
Tips for fine-tuning BERT learned on Kaggle, part 2: improving accuracy | AI Shift
- https://www.ai-shift.co.jp/techblog/2145
"Intent" and "orthogonality" in specifications that support scaling requirements - Qiita
- https://qiita.com/hirokidaichi/items/61ad129eae43771d0fc3
Object detection with YOLOv5 - アルファテックブログ
- https://www.alpha.co.jp/blog/202108_02/
Comfortable UI styling with Styled System
- https://zenn.dev/poteboy/articles/ed97328b568acd
A book that covers the theory of computational machine learning in depth, freely available as a PDF
- https://twitter.com/chizu_potato/status/1496433715047956480?s=12
An article summarizing developments from BoW to BERT and beyond
- https://twitter.com/isid_ai_team/status/1495171684051001345?s=12
Transformer implementation and walkthrough in TensorFlow
- https://www.tensorflow.org/tutorials/text/transformer?hl=ja
Finding the optimal number of clusters for k-means
- https://di-acc2.com/programming/python/4235/
A detailed explanation of positional encoding
- https://kazemnejad.com/blog/transformer_architecture_positional_encoding/
Asset building
- https://twitter.com/sarumon23/status/1506737620889526288
Fully understanding "I fully understand it".
- https://developers.freee.co.jp/entry/understand-of-perfect-understanding
FLAML, an AutoML tool from Microsoft
- https://qiita.com/ozora/items/66b9bfdd1cd2129331d7
CAP12: small, universal speech representations that understand the intonation of speech (1/3)
- https://webbigdata.jp/ai/post-12950
Why you should not casually reach for Alpine for lightweight Docker images - inductor's blog
- https://blog.inductor.me/entry/alpine-not-recommended
A mobile version of the Vision Transformer?
- https://www.marktechpost.com/2022/03/24/apple-ml-researchers-introduce-mobilevit-a-light-weight-and-general-purpose-vision-transformer-for-mobile-devices/
10 things to keep in mind in order to grow and graduate
- https://twitter.com/hisashi_is/status/1509750732173737985
https://towardsdatascience.com/feature-engineering-for-time-series-data-f0cb1c1265d3
https://www.marktechpost.com/2022/04/02/researchers-develop-parking-analytics-framework-using-deep-learning/
AWS training and hands-on for when new members join - Qiita
- https://qiita.com/shu85t/items/00564e29ff8a87e1dbae
Hive and Presto are distributed SQL query engines
- https://qiita.com/haramiso/items/122d4ea0e5660e0b4e41
Networking differences between the public clouds
- https://www.alpha.co.jp/blog/202007_02
For CDK, make sure to read this author's articles
- https://dev.classmethod.jp/articles/cdk-practice-26-version-2/
sweetviz
- Looks like an EDA tool.
- https://twitter.com/resistance0108/status/1512608075399991297?s=12&t=n166ajGpVMUd2TVqE3J9jQ
Introduction to the martingale approach
- https://twitter.com/yuki_kaggler/status/1513503895758155782?s=12&t=OL6_YOha7JReQZ1C1rVKug
gaCNN
- Optimizing CNN architectures with a genetic algorithm (GA)!
- https://ai-scholar.tech/articles/%E9%80%B2%E5%8C%96%E8%A8%88%E7%AE%97/gaCNN
Evaluating efficiency gains in product name matching for Yahoo! Shopping using BERT and vector search
- https://techblog.yahoo.co.jp/entry/2022040630294096/
https://www.marktechpost.com/2022/03/31/stanford-researchers-have-developed-a-machine-learning-based-algorithm-to-detect-autism-in-brain-fingerprints/
Introducing 6 required-reading books for new-graduate data scientists! | 白金鉱業.FM
- https://shirokane-kougyou.fm/episode/33
SAP (AWS Certified Solutions Architect - Professional) references
- https://dev.classmethod.jp/articles/aws-all-certifications-and-how-to-study/
- https://dev.classmethod.jp/articles/aws-certified-3-associates-and-sap/
- https://note.com/nabeyakiu/n/n075373919c20

note

NeuralODE

Research that makes the layer (depth) direction continuous and expresses a ResNet as a single ordinary differential equation.
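
The connection in the note above can be sketched as follows: a residual update h_{k+1} = h_k + step * f(h_k, t_k) has the same form as one explicit Euler step for dh/dt = f(h, t), so stacking more layers with a smaller step approaches the ODE solution. This is only an illustration. The names `f`, `resnet_forward`, and `ode_solution`, and the toy dynamics dh/dt = -h, are assumptions chosen for the example and are not taken from any NeuralODE implementation.

```python
import math


def f(h, t):
    """Hypothetical residual function standing in for a learned layer.

    Assumption for this sketch: dh/dt = -h, whose exact solution is
    h(t) = h(0) * exp(-t).
    """
    return -h


def resnet_forward(h0, num_layers, step):
    """Apply num_layers residual updates h <- h + step * f(h, t)."""
    h = h0
    for k in range(num_layers):
        h = h + step * f(h, k * step)
    return h


def ode_solution(h0, t_end):
    """Closed-form solution of the toy ODE, used only for comparison."""
    return h0 * math.exp(-t_end)


if __name__ == "__main__":
    exact = ode_solution(1.0, 1.0)
    for layers in (10, 100, 1000):
        approx = resnet_forward(1.0, layers, 1.0 / layers)
        print(layers, approx, abs(approx - exact))
```

With the total "time" fixed at 1.0, the gap to the exact solution shrinks as the number of layers grows, which is the sense in which the depth direction becomes continuous.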

General

A note that when using adversarial training, it may be better to modify the activation function
- https://ai-scholar.tech/articles/adversarial-perturbation/smooth-adversarial-training
Poincare Embeddings
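- Embedding into a different space! New developments in representation learning opened up by Poincare Embeddings - ABEJA Tech Blog
- Embedding vectors that can be represented in very low dimensions by embedding them in hyperbolic space.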
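
As a rough illustration of why hyperbolic space helps, the sketch below computes the standard distance on the Poincare ball, d(u, v) = arcosh(1 + 2 * ||u - v||^2 / ((1 - ||u||^2) * (1 - ||v||^2))). The function name `poincare_distance` and the sample points are assumptions for this example; no training loop is shown.

```python
import math


def poincare_distance(u, v):
    """Distance between two points inside the unit (Poincare) ball."""
    sq_norm_u = sum(x * x for x in u)
    sq_norm_v = sum(x * x for x in v)
    if sq_norm_u >= 1.0 or sq_norm_v >= 1.0:
        raise ValueError("points must lie strictly inside the unit ball")
    sq_diff = sum((a - b) ** 2 for a, b in zip(u, v))
    ratio = 2.0 * sq_diff / ((1.0 - sq_norm_u) * (1.0 - sq_norm_v))
    return math.acosh(1.0 + ratio)


if __name__ == "__main__":
    # The same Euclidean gap (0.05) near the origin and near the boundary.
    print(poincare_distance([0.0, 0.0], [0.05, 0.0]))
    print(poincare_distance([0.9, 0.0], [0.95, 0.0]))
```

Distances grow quickly toward the boundary of the ball, so there is a lot of "room" near the edge for the leaves of a hierarchy even in two dimensions.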

Bio

PyMOL seems to be commonly used.
- https://hira-labo.com/archives/209
- https://hira-labo.com/archives/1882
- https://hira-labo.com/archives/1544

Kaggle

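I keep hearing that Kaggle's H&M recommendation competition was a good one.
- Two-stage recommenders (candidate generation, then reranking) ranked at the top
- Careful feature engineering mattered, and neural networks reportedly did not do well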
- https://yng87.github.io/blog/2022/05/kaggle_hm/
- https://zenn.dev/zerebom/articles/9e6bad764d3f97
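
The two-stage shape mentioned above (candidate generation, then reranking) can be sketched with a toy example. Everything here is hypothetical: the function names, the purchase history, and the hand-written linear score are assumptions for illustration, and in practice the second stage would be a learned ranker over many engineered features. This is not a reconstruction of any competitor's solution.

```python
from collections import Counter


def generate_candidates(history, user, n_popular=3):
    """Stage 1: cheap recall from the user's past items plus popular items."""
    popularity = Counter(item for items in history.values() for item in items)
    popular = [item for item, _ in popularity.most_common(n_popular)]
    own = history.get(user, [])
    # dict.fromkeys removes duplicates while preserving order.
    return list(dict.fromkeys(own + popular)), popularity


def rerank(candidates, popularity, history, user, top_k=2):
    """Stage 2: score each candidate with simple features and sort."""
    own_counts = Counter(history.get(user, []))

    def score(item):
        # Toy linear score (assumed weights), standing in for a trained ranker.
        return 2.0 * own_counts[item] + 1.0 * popularity[item]

    return sorted(candidates, key=score, reverse=True)[:top_k]


if __name__ == "__main__":
    history = {"u1": ["a", "b", "a"], "u2": ["b", "c"], "u3": ["b"]}
    candidates, popularity = generate_candidates(history, "u1")
    print(rerank(candidates, popularity, history, "u1"))
```

Splitting the work this way keeps the expensive scoring step limited to a short candidate list per user.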

Time-series data

Anomaly detection in system failures, for SREs
- https://speakerdeck.com/yuukit/sre-next-2022

Libraries

cuml
- NVIDIA's library that speeds up sklearn-style machine learning by running it on the GPU
- https://github.com/rapidsai/cuml

CV

ShakeDrop
- https://qiita.com/yu4u/items/a9fc529c85534eca11e5
- A regularization method that randomly perturbs the residual branches of ResNet-family networks during training.

OCR in Python (Tesseract + PyOCR)

https://rightcode.co.jp/blog/information-technology/python-tesseract-image-processing-ocr

pdftoppm (convert PDF to images)

https://atmarkit.itmedia.co.jp/ait/articles/1903/08/news039.html

References

Alammar's visual explanation of the Transformer
- http://jalammar.github.io/illustrated-retrieval-transformer/

TabPFN

https://arxiv.org/abs/2207.01848v3

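We considered only PFN Transformers with 12 layers, an embedding size of 512, a hidden size of 1024 in the feed-forward layers, and 4-head attention. We used the Adam optimizer (Kingma and Ba, 2015) with linear warm-up and cosine annealing (Loshchilov and Hutter, 2017). For each training run, we tested a set of three learning rates {.001, .0003, .0001} and used the one with the lowest final training loss. The resulting model contains 25.82M parameters.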

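The final model was trained for 18,000 steps with a batch size of 512 datasets. That is, our TabPFN is trained on 9,216,000 synthetically generated datasets. This training takes 20 hours on 8 GPUs (Nvidia RTX 2080 Ti). Each dataset had a fixed size of 1024 and was split uniformly at random into training and validation. In general, we observed that the learning curve tends to flatten after around 10 million datasets and is very noisy. Presumably this is because our prior generates a wide variety of datasets.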
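
The figures quoted above can be cross-checked with a little arithmetic. This is only a sanity check on the numbers in the excerpt; the variable names are made up for this note, and the derived throughput assumes the 20 hours is wall-clock time on all 8 GPUs.

```python
steps = 18_000          # training steps (from the excerpt)
batch_size = 512        # datasets per step (from the excerpt)
dataset_size = 1024     # samples per synthetic dataset (from the excerpt)
gpus = 8
hours = 20

total_datasets = steps * batch_size
total_samples = total_datasets * dataset_size
gpu_hours = gpus * hours
datasets_per_gpu_hour = total_datasets / gpu_hours

print(total_datasets)          # 9216000, matching the 9,216,000 quoted above
print(total_samples)
print(gpu_hours)               # 160
print(datasets_per_gpu_hour)   # 57600.0
```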

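TabPFN is not made for datasets with a training size above 1024. Prediction may take a while and may be less reliable. We advise against running datasets with more than 10k samples, since this may crash your machine (because of TabPFN's quadratic memory scaling). Pass overwrite_warning=True to the fit function to confirm that you want to run anyway.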

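TabPFN is an open-source Transformer-based model pretrained on synthetic datasets. It has been shown to outperform tree-based models on many small datasets. It currently works for classification on small datasets with fewer than 1,000 rows, up to 100 features, and up to 10 classes.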
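
Before reaching for TabPFN, it may help to check a dataset against the limits quoted above. The helper below is a hypothetical pre-flight check written for this note, not part of the TabPFN package; the function name and default thresholds are assumptions that mirror the numbers in the text (fewer than 1,000 rows, at most 100 features, at most 10 classes).

```python
def tabpfn_limit_violations(n_rows, n_features, n_classes,
                            max_rows=1000, max_features=100, max_classes=10):
    """Return a list of human-readable problems; empty means within limits."""
    problems = []
    if n_rows >= max_rows:
        problems.append(f"{n_rows} rows (expected fewer than {max_rows})")
    if n_features > max_features:
        problems.append(f"{n_features} features (expected at most {max_features})")
    if n_classes > max_classes:
        problems.append(f"{n_classes} classes (expected at most {max_classes})")
    return problems


if __name__ == "__main__":
    print(tabpfn_limit_violations(500, 20, 3))
    print(tabpfn_limit_violations(20_000, 150, 12))
```

If the list is non-empty, subsampling rows or selecting features first is one option; the excerpt above also notes that memory use scales quadratically, so large inputs are risky regardless.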

ARIMA

A thorough explanation of the ARIMA and SARIMA models that appear in time-series analysis
- https://bigdata-tools.com/arima-sarima-model/


Last updated 2 years ago