80-90% of the world's data is unstructured. So why spend more time than you need to on structured data? auto-sklearn helps you use industry-standard techniques to run a preliminary investigation of your dataset, so that you can focus on the creative part!
This package will help you perform ML on ANY tabular data. The result of modelling will be a weighted network describing which features were associated with which modelling method. This will help you understand:
- Feature Bias
- Useless Features
- Co-dependency of Features
- PCA efficiency
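As a rough, standalone illustration of two of the checks listed above (useless features and co-dependent features), here is a minimal sketch that uses only the Python standard library. It is not this package's API: the names pearson and screen_features, the thresholds, and the toy data are all assumptions made up for the example. The co-dependent pairs are stored as weighted edges, which is one simple way to think about a weighted network of features.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, pvariance


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant numeric columns."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def screen_features(table, var_tol=1e-12, corr_threshold=0.9):
    """Screen a table given as {column name: list of numbers}.

    Returns (useless, edges):
      useless: columns whose variance is (near) zero.
      edges:   {(col_a, col_b): correlation} for strongly correlated pairs,
               i.e. a small weighted network of co-dependent features.
    Hypothetical helper for illustration; thresholds are arbitrary defaults.
    """
    useless = [name for name, col in table.items() if pvariance(col) <= var_tol]
    usable = [name for name in table if name not in useless]
    edges = {}
    for a, b in combinations(usable, 2):
        r = pearson(table[a], table[b])
        if abs(r) >= corr_threshold:
            edges[(a, b)] = round(r, 3)
    return useless, edges


# Toy data, invented for this example only.
toy = {
    "height_cm": [150.0, 160.0, 170.0, 180.0, 190.0],
    "height_in": [59.1, 63.0, 66.9, 70.9, 74.8],  # same quantity, other unit
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],        # carries no information
    "noise": [3.0, -1.0, 4.0, -1.0, 5.0],
}

useless, edges = screen_features(toy)
print("useless:", useless)
print("co-dependent:", edges)
```

Constant columns are dropped before the correlation step because their standard deviation is zero, which would otherwise cause a division by zero.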
Beyond providing the information above, this package has minimal dependencies, listed in requirements.txt. Odds are you already have them installed.
To install the package, clone the GitHub repository and then install it with pip.
git clone https://github.com/your-username/auto-sklearn.git
cd auto-sklearn
Create and activate a virtual environment. You can name it whatever you like by replacing auto-env.
python -m venv auto-env
source auto-env/bin/activate # On Windows: auto-env\Scripts\activate
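Note that the -e flag is important: it installs the package in editable (development) mode, so changes to the cloned source take effect without reinstalling.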
pip install -e .
import auto_sklearn  # assumed module name; a hyphen is not valid in a Python import statement
I am building this to learn the intricacies of ML. I do not see ML as a statistical method for dealing with data; I see it as a means of getting a result through methodical data morphology. The results, together with my inferences from them, are what make a good data analysis. I want to understand ML well enough that I can focus on honing my inferential skills.
I am developing this project to blaze through basic modelling when I want to:
- Test new datasets.
- Experiment with modelling methods.
- Compare modelling with Deep Learning etc.
Finally, this package will help me write Data Science blog posts in much less time by cutting down preliminary data testing. This in turn lets me focus on the novel and creative aspects of Data Science. I also get to learn new areas of DS on a weekly basis.
You can find my blogs here: bhargavkantheti.com.
TBA