Skip to content

carlospadron/Notes

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Notes

Compilation of notes for different technologies

When to use what for data analysis:

  • Python:
    • the data fits the memory
    • there is a library that does what you need in a efficient manner
    • you manage pandas and geopandas very well
  • SQL
    • the data does not fit the memory
    • you plan to use many other tables already present in the database
    • you plan to write very complex queries
  • Scala/Kotlin
    • the data fits the memory
    • you didn't find a library in Python that does what you need in a efficient way
    • you need to write your own algorithm/solutions in an environment that is fast and capable by nature
  • Rust
    • you need to write your own algorithm/solutions that runs very fast and that you can port to Python as a binding
  • Pyspark
    • your data is massive and needs to be spread on different clusters

About

Compilation of notes

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published