We give new and efficient black-box reconstruction algorithms for some classes of depth-$3$ arithmetic circuits. As a consequence, we obtain the first efficient algorithm for computing the tensor rank and for finding the optimal tensor decomposition as a sum of rank-one tensors when then input is a {\it constant-rank} tensor. More specifically, we provide efficient learning algorithms that run in randomized polynomial time over general fields and in deterministic polynomial time over $\mathbb R$ and $\mathbb C$ for the following classes:
\begin{enumerate}
\item {\it Set-multilinear depth-$3$ circuits of constant top fan-in ($\Sigma\Pi\Sigma_{{\sqcup_j X_j}}(k)$ circuits)}. As a consequence of our algorithm, we obtain the first polynomial time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank tensors. This result holds for $d$ dimensional tensors for any $d$, but is interesting even for $d=3$.
\item {\it Sums of powers of constantly many linear forms ($\Sigma\!\wedge\!\Sigma(k)$ circuits)}. As a consequence we obtain the first polynomial-time algorithm for tensor rank computation and optimal tensor decomposition of constant-rank symmetric tensors.
\item {\it Multilinear depth-3 circuits of constant top fan-in (multilinear $\Sigma\Pi\Sigma(k)$ circuits)}. Our algorithm works over all fields of characteristic 0 or large enough characteristic. Prior to our work the only efficient algorithms known were over polynomially-sized finite fields \cite{KarninShpilka09}.
\end{enumerate}
Prior to our work, the only polynomial-time or even subexponential-time algorithms known (deterministic or randomized) for subclasses of $\Sigma\Pi\Sigma(k)$ circuits that also work over large/infinite fields were for the setting when the top fan-in $k$ is at most $2$ \cite{Sin16, Sin20}.