In this paper, we design and implement a human–machine interaction application that enables a visually impaired person to locate and manipulate personal objects in his or her immediate surroundings. In this setting, we need a tool, embedded in a mobile phone, that is capable of sensing, computing, and guiding the human arm toward the object. This involves solving two subproblems: (1) recognizing objects in the input images, and (2) generating control signals that guide the user's hand to the desired destination. For the former, we adapt the bag-of-words framework for recognition and matching on mobile phones. For the latter, we develop a moment-based human servoing algorithm that generates commands helping the visually impaired user localize his or her hand with respect to the object of interest. All necessary computations take place on the mobile phone, and the proposed object recognition and vision-based control design is deployed on a low- to mid-end device, opening the approach to a wide range of assistive applications. With our proposed design and implementation, we demonstrate that the application is effective and accurate, converging reliably across different experimental settings.
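As a rough illustration of the first subproblem, the following Python sketch shows a generic bag-of-words recognition pipeline built on OpenCV. The paper's actual feature type, vocabulary size, and matcher are not specified in the abstract, so the use of SIFT descriptors, `k=200` visual words, k-means vocabulary construction, and cosine-similarity matching here are all illustrative assumptions rather than the authors' implementation.

```python
# Hedged sketch of bag-of-words object recognition (assumed SIFT + k-means).
import cv2
import numpy as np

def build_vocabulary(training_images, k=200):
    """Cluster local descriptors from training images into k visual words."""
    sift = cv2.SIFT_create()
    all_desc = []
    for img in training_images:
        _, desc = sift.detectAndCompute(img, None)
        if desc is not None:
            all_desc.append(desc)
    data = np.vstack(all_desc).astype(np.float32)
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 20, 1e-3)
    _, _, centers = cv2.kmeans(data, k, None, criteria, 3,
                               cv2.KMEANS_PP_CENTERS)
    return centers  # (k, 128) visual-word centroids

def bow_histogram(img, vocabulary):
    """Quantize an image's descriptors into a normalized word histogram."""
    sift = cv2.SIFT_create()
    _, desc = sift.detectAndCompute(img, None)
    hist = np.zeros(len(vocabulary), dtype=np.float32)
    if desc is None:
        return hist
    # Assign each descriptor to its nearest visual word (brute force).
    dists = np.linalg.norm(desc[:, None, :] - vocabulary[None, :, :], axis=2)
    for w in np.argmin(dists, axis=1):
        hist[w] += 1.0
    return hist / (np.linalg.norm(hist) + 1e-9)  # L2-normalized

def recognize(query_img, vocabulary, object_histograms):
    """Return the stored object label whose histogram best matches the query."""
    q = bow_histogram(query_img, vocabulary)
    return max(object_histograms.items(), key=lambda kv: float(q @ kv[1]))[0]
```

In practice, `object_histograms` would map each personal object's label to a histogram precomputed from enrollment photos, keeping the per-frame cost on the phone limited to feature extraction and one histogram comparison per object.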
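For the second subproblem, a minimal sketch of moment-based guidance follows. It assumes the object of interest has already been segmented into a binary mask in the camera frame; the centroid-and-area control mapping and the command vocabulary (`move left`, `grasp`, and so on) are hypothetical stand-ins for the paper's actual human servoing law, which the abstract does not detail.

```python
# Hedged sketch of moment-based hand guidance (assumed segmentation mask).
import cv2

def guidance_command(mask, frame_shape, target_area_ratio=0.25, tol=0.05):
    """Map image moments of the object mask to a directional command.

    Drives the object's centroid toward the image center and uses the
    zeroth moment (apparent area) as a crude proxy for hand-to-object
    distance; both choices are illustrative assumptions.
    """
    h, w = frame_shape[:2]
    m = cv2.moments(mask, binaryImage=True)
    if m["m00"] == 0:
        return "searching"  # object not visible in the current frame
    cx, cy = m["m10"] / m["m00"], m["m01"] / m["m00"]
    # Normalized centroid offsets from the image center.
    ex, ey = (cx - w / 2) / w, (cy - h / 2) / h
    if abs(ex) > tol:
        return "move right" if ex > 0 else "move left"
    if abs(ey) > tol:
        return "move down" if ey > 0 else "move up"
    # Object centered: close the distance until it fills enough of the frame.
    if m["m00"] / (w * h) < target_area_ratio:
        return "move forward"
    return "grasp"  # hand aligned with the object
```

Such a command would be issued once per processed frame (for example, as synthesized speech or vibration cues), so the loop converges as the user's hand follows successive corrections.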