The design of systems that use vision to enable autonomous robotic grasping and object manipulation is one of the major topics in Robotics and Automation. In unstructured environments, and without any prior information about the environment or the robot model, the question is how visual sensors can be used to autonomously estimate the information needed about the environment and the object to manipulate, and how an autonomous calibration of the robot model can be performed. The goal of this thesis is to develop a vision-based system that performs autonomous recognition and grasping of known objects. The proposed approach addresses the problem by implementing two pipelines on top of a Pose-Based Visual Servoing framework. First, a Perception Pipeline performs object recognition and segmentation, exploiting state-of-the-art machine learning algorithms. The desired end-effector pose for grasping is assumed to be known in advance, based on the characteristics of the target object. Then, to drive the hand to the desired pose, a Calibration Pipeline has been designed: it autonomously estimates the Camera-Base transformation and calibrates the kinematic chain by solving a non-linear optimization problem, which is set up using visual estimates of the end-effector pose. This pipeline is intended to run before the actual grasping routine. To account for run-time changes in the previously estimated model, an online internal-model calibration procedure has also been designed, in which an Extended Kalman Filter refines the estimated model at the beginning of the grasping routine. The framework is implemented and tested on a reproduction of the AlterEgo humanoid robot arm, using the ROS middleware.
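As a minimal sketch of how the Calibration Pipeline's non-linear optimization could be posed (the abstract does not fix the exact formulation; the forward-kinematics map $f$, the parameter vector $\phi$, and the frame notation below are assumptions introduced for illustration), the Camera-Base transformation and the kinematic parameters can be estimated jointly as a least-squares problem over visually measured end-effector poses:
\[
(\hat{T}^{c}_{b}, \hat{\phi}) \;=\; \arg\min_{T^{c}_{b},\,\phi} \;\sum_{i=1}^{N} \Big\| \log\!\Big( \big(T^{c}_{b}\, f(q_i;\phi)\big)^{-1}\, \hat{T}^{c}_{e,i} \Big)^{\vee} \Big\|^{2}
\]
where $q_i$ are the recorded joint configurations, $f(q;\phi)$ is the forward kinematics of the arm parameterized by $\phi$, $T^{c}_{b}$ is the unknown Camera-Base transformation, $\hat{T}^{c}_{e,i}$ are the visually estimated end-effector poses in the camera frame, and $\log(\cdot)^{\vee}$ maps the residual pose in $SE(3)$ to a 6-dimensional error vector.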
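Similarly, a hedged sketch of the online Extended Kalman Filter step, under the assumption that the state $x_k$ collects the model parameters being refined, that a random-walk process model is used, and that the measurements $z_k$ are the visually observed end-effector poses:
\[
\begin{aligned}
\hat{x}_{k|k-1} &= \hat{x}_{k-1|k-1}, &\qquad P_{k|k-1} &= P_{k-1|k-1} + Q,\\
K_k &= P_{k|k-1} H_k^{\top}\big(H_k P_{k|k-1} H_k^{\top} + R\big)^{-1}, &\qquad H_k &= \left.\frac{\partial h}{\partial x}\right|_{\hat{x}_{k|k-1}},\\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + K_k\big(z_k - h(\hat{x}_{k|k-1})\big), &\qquad P_{k|k} &= \big(I - K_k H_k\big)\,P_{k|k-1},
\end{aligned}
\]
where $h(\cdot)$ predicts the end-effector pose measurement from the current parameter estimate (through the forward kinematics and the Camera-Base transformation), and $Q$, $R$ are the process and measurement noise covariances.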